Abstract

The relationship between Popper spaces (conditional probability spaces that satisfy
some regularity conditions), lexicographic probability systems (LPS’s) [Blume,
Brandenburger, and Dekel 1991a; Blume, Brandenburger, and Dekel 1991b], and
nonstandard probability spaces (NPS’s) is considered. If countable additivity is
assumed, Popper spaces and a subclass of LPS’s are equivalent; without the
assumption of countable additivity, the equivalence no longer holds. If the state space
is finite, LPS’s are equivalent to NPS’s. However, if the state space is infinite, NPS’s
are shown to be more general than LPS’s.

JEL classification numbers: C70; D80; D81.


1  Introduction 

Probability is certainly the most commonly used approach for representing uncertainty,
and conditioning is the standard way of updating probabilities in the light of new
information. Unfortunately, there is a well-known problem with conditioning: conditioning on
events of measure 0 is not defined. That makes it unclear how to proceed if an agent
learns something to which she initially assigned probability 0. Although consideration of
events of measure 0 may seem to be of little practical interest, it turns out to play a critical
role in game theory, particularly in the analysis of strategic reasoning in extensive-form
games and in the analysis of weak dominance in normal-form games (see, for example,
[Battigalli 1996; Battigalli and Siniscalchi 2002; Blume, Brandenburger, and Dekel 1991a;
Blume, Brandenburger, and Dekel 1991b; Brandenburger, Friedenberg, and Keisler 2008;
Fudenberg and Tirole 1991; Hammond 1994; Hammond 1999; Kohlberg and Reny 1997;
Kreps and Wilson 1982; Myerson 1986; Selten 1965; Selten 1975]). It also arises in the
analysis of conditional statements by philosophers (see [Adams 1966; McGee 1994]), and
in dealing with nonmonotonicity in Artificial Intelligence (see, for example, [Lehmann
and Magidor 1992]).

*The work was supported in part by NSF under grants IRI-96-25901, IIS-0090145, and CTC-0208535,
by ONR under grant N00014-02-1-0455, and by the DoD Multidisciplinary University Research
Initiative (MURI) program administered by the ONR under grant N00014-01-1-0795. A preliminary version
appeared in the Proceedings of the Eighth Conference on Theoretical Aspects of Rationality and
Knowledge, 2001 [Halpern 2001]. This version includes detailed proofs and more discussion and more examples;
in addition, the material in Section 6 (on independence) is new.

There have been various attempts to deal with the problem of conditioning on events
of measure 0. Perhaps the best known involves conditional probability spaces (CPS’s).
The idea, which goes back to Popper [1934, 1968] and de Finetti [1936], is to take as
primitive not probability, but conditional probability. If $\mu$ is a conditional probability
measure on a space $W$, then $\mu(V \mid U)$ may still be undefined for some pairs $V$ and $U$, but
it is also possible that $\mu(V \mid U)$ is defined even if $\mu(U \mid W) = 0$. A second approach, which
goes back to at least Robinson [1973] and has been explored in the economics literature
[Hammond 1994; Hammond 1999], the AI literature [Lehmann and Magidor 1992; Wilson
1995], and the philosophy literature (see [McGee 1994] and the references therein), is to
consider nonstandard probability spaces (NPS’s), where there are infinitesimals that can
be used to model events that, intuitively, have infinitesimally small probability yet may
still be learned or observed.

There is a third approach to this problem, which uses sequences of probability
measures to represent uncertainty. The most recent exemplar of this approach, which I focus
on here, is the lexicographic probability systems (LPS’s) of Blume, Brandenburger, and
Dekel [1991a, 1991b] (BBD from now on). However, the idea of using a system of
measures to represent uncertainty actually was explored as far back as the 1950s by Rényi
[1956] (see Section 3.4). A lexicographic probability system is a sequence $(\mu_0, \mu_1, \ldots)$ of
probability measures. Intuitively, the first measure in the sequence, $\mu_0$, is the most
important one, followed by $\mu_1$, $\mu_2$, and so on. One way to understand LPS’s is in terms
of NPS’s. Roughly speaking, the probability assigned to an event $U$ by a sequence such
as $(\mu_0, \mu_1)$ can be taken to be $\mu_0(U) + \epsilon\mu_1(U)$, where $\epsilon$ is an infinitesimal. Thus, even if
the probability of $U$ according to $\mu_0$ is 0, $U$ still has a positive (although infinitesimal)
probability if $\mu_1(U) > 0$.

What is the precise relationship between these approaches? The relationship between
LPS’s and CPS’s has been considered before. For example, Hammond [1994] shows that
conditional probability spaces are equivalent to a subclass of LPS’s called lexicographic
conditional probability spaces (LCPS’s) if the state space is finite and it is possible to
condition on any nonempty set.1 As shown by Spohn [1986], Hammond’s result can
be extended to arbitrary countably additive Popper spaces, where a Popper space is a
conditional probability space where the events on which conditioning is allowed satisfy
certain regularity conditions. As I show, this result depends critically on a number
of assumptions. In particular, it does not work without the assumption of countable
additivity, it requires that we extend LCPS’s appropriately to the infinite case, and it is
sensitive to the choice of conditioning events. For example, if we consider CPS’s where
the conditioning events can be viewed as information sets, and so are not closed under
supersets (this is essentially the case considered by Battigalli and Siniscalchi [2002]),
then the result no longer holds.

Turning to the relationship between LPS’s and NPS’s, I show that if the state space
is finite, then LPS’s are in a sense equivalent to NPS’s. More precisely, say that two
measures of uncertainty $\nu_1$ and $\nu_2$ (each of which can be either an LPS or an NPS) are
equivalent, denoted $\nu_1 \approx \nu_2$, if they cannot be distinguished by (real-valued) random
variables; that is, for all random variables $X$ and $Y$, $E_{\nu_1}(X) \le E_{\nu_1}(Y)$ iff
$E_{\nu_2}(X) \le E_{\nu_2}(Y)$ (where $E_\nu(X)$ denotes the expected value of $X$ with respect to $\nu$).
To the extent that we are interested in these representations of uncertainty for decision making,
we should not try to distinguish two representations that are equivalent. I show that, in
finite spaces, there is a straightforward bijection between $\approx$-equivalence classes of LPS’s
and NPS’s. This equivalence breaks down if the state space is infinite; in this case, NPS’s
are strictly more general than LPS’s (whether or not countable additivity is assumed).

Finally,  I  consider  the  relationship  between  Popper  spaces  and  NPS’s,  and  show  that 
NPS’s  are  more  general.  (The  theorem  I  prove  is  a  generalization  of  one  proved  by  McGee 
[1994],  but  my  interpretation  of  it  is  quite  different;  see  Section  5.) 

These  results  give  some  useful  insight  into  independence  of  random  variables.  There 
have  been  a  number  of  alternative  notions  of  independence  considered  in  the  literature  of 
extended  probability  spaces  (i.e.,  approaches  that  deal  with  the  problem  of  conditioning 
on  sets  of  measure  0):  BBD  considered  three;  Kohlberg  and  Reny  [1997]  considered  two 
others.  It  turns  out  that  these  notions  are  perhaps  best  understood  in  the  context  of 
NPS’s;  I  describe  and  compare  them  here. 

Many of the new results in this paper involve infinite spaces. Given that most games
studied by game theorists are finite, it is fair to ask whether these results have any
significance for game theory. I believe they do. Even if the underlying game is finite, the
set of types is infinite. Epistemic characterizations of solution concepts often make use of
complete type spaces, which include every possible type of every player, where a type
determines an (extended) probability over the strategies and types of the other players; this
must be an infinite space. For example, Battigalli and Siniscalchi [2002] use a complete
type space where the uncertainty is represented by cps’s to give an epistemic
characterization of extensive-form rationalizability and backward induction, while Brandenburger,
Friedenberg, and Keisler [2008] use a complete type space where the uncertainty is
represented by LPS’s to get a characterization of weak dominance in normal-form games. As
the results of this paper show, the set of types depends to some extent on the notion of
extended probability used. Similarly, a number of characterizations of solution concepts
depend on independence (see, for example, [Battigalli 1996; Kohlberg and Reny 1997;
Battigalli and Siniscalchi 1999]). Again, the results of this paper show that these notions
can be somewhat sensitive to exactly how uncertainty is represented, even with a finite
state space. While I do not present any new game-theoretic results here, I believe that
the characterizations I have provided may be useful both in terms of defending particular
choices of representation used and suggesting new solution concepts.

1Despite this isomorphism, it is not clear that conditional probability spaces are equivalent to LPS’s.
It depends on exactly what we mean by equivalence. The same comment applies below where the word
“equivalent” is used. See Section 7 for further discussion. I thank Geir Asheim for bringing this point
to my attention.

The remainder of the paper is organized as follows. In the next section, I review
all the relevant definitions for the three representations of uncertainty considered here.
Section 3 considers the relationship between Popper spaces and LPS’s. Section 4 considers
the relationship between LPS’s and NPS’s. Finally, Section 5 considers the relationship
between Popper spaces and NPS’s. In Section 6 I consider what these results have to say
about independence. I conclude with some discussion in Section 7.


2  Conditional, lexicographic, and nonstandard probability spaces

In this section I briefly review the three approaches to representing likelihood discussed
in the introduction.

2.1  Popper  spaces 

A conditional probability measure takes pairs $U, V$ of subsets as arguments; $\mu(V, U)$ is
generally written $\mu(V \mid U)$ to stress the conditioning aspects. The first argument comes
from some algebra $\mathcal{F}$ of subsets of a space $W$; if $W$ is infinite, $\mathcal{F}$ is often taken to be a
$\sigma$-algebra. (Recall that an algebra of subsets of $W$ is a set of subsets containing $W$ and
closed under union and complementation. A $\sigma$-algebra is an algebra that is closed under
countable union.) The second argument comes from a set $\mathcal{F}'$ of conditioning events, that
is, events on which conditioning is allowed. One natural choice is to take $\mathcal{F}'$ to
be $\mathcal{F} - \{\emptyset\}$. But it may be reasonable to consider other restrictions on $\mathcal{F}'$. For example,
Battigalli and Siniscalchi [2002] take $\mathcal{F}'$ to consist of the information sets in a game,
since they are interested only in agents who update their beliefs conditional on getting
some information. The question is what constraints, if any, should be placed on $\mathcal{F}'$.
For most of this paper, I focus on Popper spaces (named after Karl Popper), defined
next, where the set $\mathcal{F}'$ satisfies four arguably reasonable requirements, but I occasionally
consider other requirements (see Section 3.3).
Definition 2.1: A conditional probability space (cps) over $(W, \mathcal{F})$ is a tuple $(W, \mathcal{F}, \mathcal{F}', \mu)$
such that $\mathcal{F}$ is an algebra over $W$, $\mathcal{F}'$ is a set of subsets of $W$ (not necessarily an algebra
over $W$) that does not contain $\emptyset$, and $\mu : \mathcal{F} \times \mathcal{F}' \to [0,1]$ satisfies the following conditions:

CP1. $\mu(U \mid U) = 1$ if $U \in \mathcal{F}'$.

CP2. $\mu(V_1 \cup V_2 \mid U) = \mu(V_1 \mid U) + \mu(V_2 \mid U)$ if $V_1 \cap V_2 = \emptyset$, $U \in \mathcal{F}'$, and $V_1, V_2 \in \mathcal{F}$.

CP3. $\mu(V \mid U) = \mu(V \mid X) \times \mu(X \mid U)$ if $V \subseteq X \subseteq U$, $U, X \in \mathcal{F}'$, $V \in \mathcal{F}$.

Note that it follows from CP1 and CP2 that $\mu(\cdot \mid U)$ is a probability measure on $(W, \mathcal{F})$
(and, in particular, that $\mu(\emptyset \mid U) = 0$) for each $U \in \mathcal{F}'$. A Popper space over $(W, \mathcal{F})$
is a conditional probability space $(W, \mathcal{F}, \mathcal{F}', \mu)$ that satisfies three additional conditions:
(a) $\mathcal{F}' \subseteq \mathcal{F}$, (b) $\mathcal{F}'$ is closed under supersets in $\mathcal{F}$, in that if $V \in \mathcal{F}'$, $V \subseteq V'$, and
$V' \in \mathcal{F}$, then $V' \in \mathcal{F}'$, and (c) if $U \in \mathcal{F}'$ and $\mu(V \mid U) \ne 0$ then $V \cap U \in \mathcal{F}'$. If $\mathcal{F}$ is a
$\sigma$-algebra and $\mu$ is countably additive (that is, if $\mu(\bigcup_i V_i \mid U) = \sum_{i=1}^{\infty} \mu(V_i \mid U)$ if the $V_i$’s
are pairwise disjoint elements of $\mathcal{F}$ and $U \in \mathcal{F}'$), then the Popper space is said to be
countably additive. Let $Pop(W, \mathcal{F})$ denote the set of Popper spaces over $(W, \mathcal{F})$. If $\mathcal{F}$ is
a $\sigma$-algebra, I use a superscript $c$ to denote the restriction to countably additive Popper
spaces, so $Pop^c(W, \mathcal{F})$ denotes the set of countably additive Popper spaces over $(W, \mathcal{F})$.
The probability measure $\mu$ in a Popper space is called a Popper measure. ∎

The last regularity condition on $\mathcal{F}'$ required in a Popper space corresponds to the
observation that for an unconditional probability measure $\mu$, if $\mu(V \mid U) \ne 0$ then $\mu(V \cap U) \ne 0$,
so conditioning on $V \cap U$ should be defined. Note that, since this regularity condition
depends on the Popper measure, it may well be the case that $(W, \mathcal{F}, \mathcal{F}', \mu)$ and $(W, \mathcal{F}, \mathcal{F}', \nu)$
are both cps’s over $(W, \mathcal{F})$, but only the former is a Popper space over $(W, \mathcal{F})$.
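For a finite space, the conditions of Definition 2.1 can be checked by brute force. The following Python sketch is only an illustration under stated assumptions: the helper names (`powerset`, `check_cps`, `is_popper`) and the two-world measure `mu`, which is built from two hypothetical layers of weights, are not taken from the paper. It also illustrates the remark above that whether a cps is a Popper space depends on the set of conditioning events together with the measure.

```python
from fractions import Fraction
from itertools import combinations

def powerset(W):
    """All subsets of the finite set W, as frozensets."""
    items = sorted(W)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def check_cps(F, Fp, mu):
    """Brute-force check of CP1-CP3; mu(V, U) stands for mu(V | U)."""
    for U in Fp:
        if mu(U, U) != 1:                                   # CP1
            return False
        for V1 in F:
            for V2 in F:
                if not (V1 & V2) and mu(V1 | V2, U) != mu(V1, U) + mu(V2, U):
                    return False                            # CP2
        for X in Fp:
            if X <= U:
                for V in F:
                    if V <= X and mu(V, U) != mu(V, X) * mu(X, U):
                        return False                        # CP3
    return True

def is_popper(F, Fp, mu):
    """Check the three extra conditions (a)-(c) on the conditioning events."""
    Fset, Fpset = set(F), set(Fp)
    if not Fpset <= Fset:                                   # (a)
        return False
    for V in Fp:
        for V2 in F:
            if V <= V2 and V2 not in Fpset:                 # (b)
                return False
    for U in Fp:
        for V in F:
            if mu(V, U) != 0 and (V & U) not in Fpset:      # (c)
                return False
    return True

# Hypothetical example: world "a" is infinitely more likely than world "b".
LAYERS = [{"a": Fraction(1)}, {"b": Fraction(1)}]

def mu(V, U):
    """Condition on U using the first layer that gives U positive weight."""
    for layer in LAYERS:
        mass = sum(layer.get(w, 0) for w in U)
        if mass > 0:
            return sum(layer.get(w, 0) for w in V & U) / mass
    raise ValueError("conditioning is undefined on this event")

W = frozenset({"a", "b"})
F = powerset(W)
Fp_all = [U for U in F if U]      # condition on every nonempty set
Fp_top = [W]                      # condition only on W

print(check_cps(F, Fp_all, mu), is_popper(F, Fp_all, mu))
print(check_cps(F, Fp_top, mu), is_popper(F, Fp_top, mu))
```

With `Fp_top`, the axioms CP1-CP3 still hold, but condition (c) fails because the event {a} has positive probability given W and yet is not a conditioning event.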

Popper [1934, 1968] and de Finetti [1936] were the first to formally consider
conditional probability as the basic notion, although as Rényi [1964] points out, the idea
of taking conditional probability as primitive seems to go back as far as Keynes [1921].
CP1-3 are essentially due to Rényi [1955]. Van Fraassen [1976] defined what I have called
Popper measures; he called them Popper functions, reserving the name Popper measure
for what I am calling a countably additive Popper measure. Starting from the work of de
Finetti, there has been a general study of coherent conditional probabilities. A coherent
conditional probability is essentially a cps that is not necessarily a Popper space, since it
is defined on a set $\mathcal{F} \times \mathcal{F}'$ where $\mathcal{F}'$ does not have to be a subset of $\mathcal{F}$; see, for example,
[Coletti and Scozzafava 2002] and the references therein. Hammond [1994] discusses the
use of conditional probability spaces in philosophy and game theory, and provides an
extensive list of references.

2.2  Lexicographic  probability  spaces 

Definition 2.2: A lexicographic probability space (LPS) (of length $\alpha$) over $(W, \mathcal{F})$ is a
tuple $(W, \mathcal{F}, \vec{\mu})$ where, as before, $W$ is a set of possible worlds and $\mathcal{F}$ is an algebra over
$W$, and $\vec{\mu}$ is a sequence of finitely additive probability measures on $(W, \mathcal{F})$ indexed by
ordinals $< \alpha$. (Technically, $\vec{\mu}$ is a function from the ordinals less than $\alpha$ to probability
measures on $(W, \mathcal{F})$.) I typically write $\vec{\mu}$ as $(\mu_0, \mu_1, \ldots)$ or as $(\mu_\beta : \beta < \alpha)$. If $\mathcal{F}$ is a
$\sigma$-algebra and each of the probability measures in $\vec{\mu}$ is countably additive, then $\vec{\mu}$ is a
countably additive LPS. Let $LPS(W, \mathcal{F})$ denote the set of LPS’s over $(W, \mathcal{F})$. Again, if
$\mathcal{F}$ is a $\sigma$-algebra, a superscript $c$ is used to denote countable additivity, so $LPS^c(W, \mathcal{F})$
denotes the set of countably additive LPS’s over $(W, \mathcal{F})$. When $(W, \mathcal{F})$ are understood, I
often refer to $\vec{\mu}$ as the LPS. I write $\vec{\mu}(U) > 0$ if $\mu_\beta(U) > 0$ for some $\beta$. ∎

There is a sense in which $LPS(W, \mathcal{F})$ can capture a richer set of preferences than
$Pop(W, \mathcal{F})$, even if we restrict to finite spaces $W$ (so that countable additivity is not
an issue). For example, suppose that $W = \{w_1, w_2\}$, $\mu_0(w_1) = \mu_0(w_2) = 1/2$, and
$\mu_1(w_1) = 1$. The LPS $\vec{\mu} = (\mu_0, \mu_1)$ can be thought of as describing the situation where $w_1$
is very slightly more likely than $w_2$. Thus, for example, if $X_i$ is a bet that pays off 1 in
state $w_i$ and 0 in state $w_{3-i}$, then according to $\vec{\mu}$, $X_1$ should be (slightly) preferred to
$X_2$, but for all $r > 1$, $rX_2$ is preferred to $X_1$. There is no CPS on $\{w_1, w_2\}$ that leads to
these preferences.
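The preferences in this example can be reproduced numerically. The sketch below assumes that bets are ranked by comparing their tuples of expected values (one per measure in the LPS) lexicographically; the function names `lex_expectation`, `scale`, and `prefers` are illustrative and not taken from the paper.

```python
from fractions import Fraction

def lex_expectation(lps, bet):
    """Tuple of expected payoffs of `bet`, one entry per measure in the LPS."""
    return tuple(sum(m[w] * bet[w] for w in m) for m in lps)

def scale(r, bet):
    """The bet r*X, which pays r times what X pays in every state."""
    return {w: r * v for w, v in bet.items()}

def prefers(lps, a, b):
    """True if bet a is strictly preferred to bet b.

    Python compares tuples lexicographically, which is the order assumed here.
    """
    return lex_expectation(lps, a) > lex_expectation(lps, b)

mu0 = {"w1": Fraction(1, 2), "w2": Fraction(1, 2)}
mu1 = {"w1": Fraction(1), "w2": Fraction(0)}
lps = [mu0, mu1]

X1 = {"w1": 1, "w2": 0}   # pays 1 in w1, 0 in w2
X2 = {"w1": 0, "w2": 1}   # pays 1 in w2, 0 in w1

print(prefers(lps, X1, X2))                               # tie at mu0, broken by mu1
print(prefers(lps, scale(Fraction(101, 100), X2), X1))    # any r > 1 wins at mu0
```

The first comparison is decided by $\mu_1$ because the two bets tie under $\mu_0$; once $X_2$ is scaled by any $r > 1$, the comparison is already decided by $\mu_0$.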

Note that, in this example, the support of $\mu_1$ is a subset of that of $\mu_0$. To obtain a
bijection between LPS’s and CPS’s, we cannot allow much overlap between the supports
of the measures that make up an LPS. What counts as “much overlap” turns out to be
somewhat subtle. One way to formalize it was proposed by BBD. They defined a
lexicographic conditional probability space (LCPS) to be an LPS such that, roughly speaking,
the probability measures in the sequence have disjoint supports; more precisely, there
exist sets $U_\beta \in \mathcal{F}$ such that $\mu_\beta(U_\beta) = 1$ and the sets $U_\beta$ are pairwise disjoint for $\beta < \alpha$.
One motivation for considering disjoint sets is to consider an agent who has a sequence
of hypotheses $(h_0, h_1, \ldots)$ regarding how the world works. If the primary hypothesis $h_0$
is discarded, then the agent judges events according to $h_1$; if $h_1$ is discarded, then the
agent uses $h_2$, and so on. Associated with hypothesis $h_\beta$ is the probability measure $\mu_\beta$.
What would cause $h_\beta$ to be discarded is observing an event $U$ such that $\mu_\beta(U) = 0$. The
set $U_\beta$ is the support of the hypothesis $h_\beta$. In some cases, it seems reasonable to think
of the supports of these hypotheses as disjoint. This leads to LCPS’s.

BBD considered only finite spaces. When we move to infinite spaces, requiring
disjointness of the supports of hypotheses may be too strong. Brandenburger, Friedenberg,
and Keisler [2008] consider finite-length LPS’s $\vec{\mu}$ that satisfy the property that there exist
sets $U_\beta$ (not necessarily disjoint) such that $\mu_\beta(U_\beta) = 1$ and $\mu_\beta(U_\gamma) = 0$ for $\gamma \ne \beta$. Call
such an LPS an MSLPS (for mutually singular LPS). Let a structured LPS (SLPS) be an
LPS $\vec{\mu}$ such that there exist sets $U_\beta \in \mathcal{F}$ such that $\mu_\beta(U_\beta) = 1$ and $\mu_\beta(U_\gamma) = 0$ for $\gamma > \beta$.
Thus, in an SLPS, later hypotheses are given probability 0 according to the
probability measure induced by earlier hypotheses, but earlier hypotheses do not necessarily get
probability 0 according to the later hypotheses. (Spohn [1986] also considered SLPS’s; he
called them dimensionally well-ordered families of probability measures.) Clearly every
LCPS is an MSLPS, and every MSLPS is an SLPS. If $\alpha$ is countable and we require
countable additivity (or if $\alpha$ is finite) then the notions are easily seen to coincide. Given
an SLPS $\vec{\mu}$ with associated sets $U_\beta$, $\beta < \alpha$, define $U'_\beta = U_\beta - (\bigcup_{\gamma > \beta} U_\gamma)$. The sets $U'_\beta$ are
clearly pairwise disjoint elements of $\mathcal{F}$, and $\mu_\beta(U'_\beta) = 1$. However, in general, LCPS’s are
a strict subset of MSLPS’s, and MSLPS’s are a strict subset of SLPS’s, as the following
two examples show.

Example 2.3: Consider a well-ordering of the interval $[0, 1]$, that is, a bijection from
$[0, 1]$ to an initial segment of the ordinals. Suppose that this initial segment of the
ordinals has length $\alpha$. Let $([0, 1], \mathcal{F}, \vec{\mu})$ be an LPS of length $\alpha$ where $\mathcal{F}$ consists of the
Borel subsets of $[0, 1]$. Let $\mu_0$ be the standard Borel measure on $[0, 1]$, and let $\mu_\beta$ be the
measure that gives probability 1 to $r_\beta$, the $\beta$th real in the well-ordering. This clearly
gives an SLPS, since we can take $U_0 = [0, 1]$ and $U_\beta = \{r_\beta\}$ for $0 < \beta < \alpha$; note that
$\mu_\gamma(U_\beta) = 0$ for $\beta > \gamma$. However, this SLPS is not equivalent to any MSLPS (and hence
not to any LCPS); there is no set $U'_0$ such that $\mu_0(U'_0) = 1$ and $U'_0$ is disjoint from $\{r_\beta\}$ for
all $\beta$ with $0 < \beta < \alpha$. ∎

Example 2.4: Suppose that $W = [0, 1] \times [0, 1]$. Again, consider a well-ordering on $[0, 1]$.
Using the notation of Example 2.3, define $U_{0,\beta} = \{r_\beta\} \times [0, 1]$ and $U_{1,\beta} = [0, 1] \times \{r_\beta\}$. Define
$\mu_{i,\beta}$ to be the Borel measure on $U_{i,\beta}$. Consider the LPS $(\mu_{0,0}, \mu_{0,1}, \ldots, \mu_{1,0}, \mu_{1,1}, \ldots)$.
Clearly this is an MSLPS, but not an LCPS. ∎

The difference between LCPS’s, MSLPS’s, and SLPS’s does not arise in the work
of BBD, since they consider only finite sequences of measures. The restriction to finite
sequences, in turn, is due to their restriction to finite sets $W$ of possible worlds. Clearly,
if $W$ is finite, then all LCPS’s over $W$ must have length $\le |W|$, since the measures in an
LCPS have disjoint supports. Here it will play a more significant role.
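In the finite-length case, the step that turns the sets $U_\beta$ of an SLPS into pairwise disjoint sets $U'_\beta = U_\beta - (\bigcup_{\gamma > \beta} U_\gamma)$ is easy to carry out explicitly. The following sketch assumes a finite space and a finite sequence of point-mass measures chosen purely for illustration; `disjointify` and `measure` are hypothetical helper names.

```python
from fractions import Fraction

def measure(m, U):
    """Probability that the measure m (a dict world -> weight) assigns to U."""
    return sum(p for w, p in m.items() if w in U)

def disjointify(U):
    """Given sets U_0, ..., U_{k-1}, return U'_b = U_b minus all later sets."""
    result = []
    for b, Ub in enumerate(U):
        later = set().union(*U[b + 1:])   # union of U_g for g > b
        result.append(set(Ub) - later)
    return result

# Illustrative SLPS on W = {1, 2, 3}: each measure is a point mass.
lps = [{1: Fraction(1)}, {2: Fraction(1)}, {3: Fraction(1)}]
# The associated sets overlap, but each later set is null for earlier measures.
U = [{1, 2, 3}, {2, 3}, {3}]

U_prime = disjointify(U)
print(U_prime)
```

Each measure still gives probability 1 to its own $U'_\beta$, because the sets that were removed have probability 0 under it.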

We can put an obvious lexicographic order $<_L$ on sequences $(x_0, x_1, \ldots)$ of numbers
in $[0, 1]$ of length $\alpha$: $(x_0, x_1, \ldots) <_L (y_0, y_1, \ldots)$ if there exists $\beta < \alpha$ such that $x_\beta < y_\beta$
and $x_\gamma = y_\gamma$ for all $\gamma < \beta$. That is, we compare two sequences by comparing their
components at the first place they differ. (Even if $\alpha$ is infinite, because we are dealing
with ordinals, there will be a least ordinal at which the sequences differ if they differ at
all.) This lexicographic order will be used to define decision rules.

BBD define conditioning in LPS’s as follows. Given $\vec{\mu}$ and $U \in \mathcal{F}$ such that $\vec{\mu}(U) > 0$,
let $\vec{\mu}|U = (\mu_{k_0}(\cdot \mid U), \mu_{k_1}(\cdot \mid U), \ldots)$, where $(k_0, k_1, \ldots)$ is the subsequence of all indices for
which the probability of $U$ is positive. Formally, $k_0 = \min\{k : \mu_k(U) > 0\}$ and for an
arbitrary ordinal $\beta > 0$, if $\mu_{k_\gamma}$ has been defined for all $\gamma < \beta$ and there exists a measure
$\mu_\delta$ in $\vec{\mu}$ such that $\mu_\delta(U) > 0$ and $\delta > k_\gamma$ for all $\gamma < \beta$, then
$k_\beta = \min\{\delta : \mu_\delta(U) > 0, \delta > k_\gamma \text{ for all } \gamma < \beta\}$. Note that $\vec{\mu}|U$ is undefined if $\vec{\mu}(U) = 0$.
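For a finite-length LPS over a finite space, this definition amounts to dropping every measure that gives $U$ probability 0 and conditioning the remaining ones in the usual way. The sketch below makes that assumption (finite length, finite space); the names `measure` and `condition_lps` and the three-world example are illustrative only.

```python
from fractions import Fraction

def measure(m, U):
    """Probability that the measure m (a dict world -> weight) assigns to U."""
    return sum(p for w, p in m.items() if w in U)

def condition_lps(lps, U):
    """Conditional LPS given U: keep, in order, the measures with m(U) > 0."""
    kept = [m for m in lps if measure(m, U) > 0]
    if not kept:
        raise ValueError("conditioning is undefined: every measure gives U probability 0")
    return [
        {w: (p / measure(m, U) if w in U else Fraction(0)) for w, p in m.items()}
        for m in kept
    ]

half, one, zero = Fraction(1, 2), Fraction(1), Fraction(0)
lps = [
    {"a": half, "b": half, "c": zero},
    {"a": zero, "b": zero, "c": one},
    {"a": zero, "b": one, "c": zero},
]

print(condition_lps(lps, {"b", "c"}))   # all three measures survive
print(condition_lps(lps, {"c"}))        # only the second measure survives
```

Conditioning on {c} discards the first and third measures, so the resulting LPS has length 1; conditioning on the empty set raises an error, matching the remark that $\vec{\mu}|U$ is undefined if $\vec{\mu}(U) = 0$.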

2.3  Nonstandard  probability  spaces 

It is well known that there exist non-Archimedean fields, that is, fields that include the real
numbers as a subfield but also have infinitesimals, numbers that are positive but still
less than any positive real number. The smallest such non-Archimedean field, commonly
denoted $\mathbb{R}(\epsilon)$, is the smallest field generated by adding to the reals a single infinitesimal
$\epsilon$.2 We can further restrict to non-Archimedean fields that are elementary extensions
of the standard reals: they agree with the standard reals on all properties that can be
expressed in a first-order language with a predicate $N$ representing the natural numbers.
For most of this paper, I use only the following properties of non-Archimedean fields:

1. If $\mathbb{R}^*$ is a non-Archimedean field, then for all $b \in \mathbb{R}^*$ such that $-r < b < r$ for
some standard real $r > 0$, there is a unique closest real number $a$ such that $|a - b|$ is
an infinitesimal. (Formally, $a$ is the inf of the set of real numbers that are at least
as large as $b$.) Let $st(b)$ denote the closest standard real to $b$; $st(b)$ is sometimes
read “the standard part of $b$”.

2. If $st(\epsilon/\epsilon') = 0$, then $a\epsilon < \epsilon'$ for all positive standard real numbers $a$. (If $a\epsilon$ were
greater than $\epsilon'$, then $\epsilon/\epsilon'$ would be greater than $1/a$, contradicting the assumption
that $st(\epsilon/\epsilon') = 0$.)

Given a non-Archimedean field $\mathbb{R}^*$, a nonstandard probability space (NPS) over $(W, \mathcal{F})$
(with range $\mathbb{R}^*$) is a tuple $(W, \mathcal{F}, \mu)$, where $W$ is a set of possible worlds, $\mathcal{F}$ is an
algebra of subsets of $W$, and $\mu$ assigns to sets in $\mathcal{F}$ a nonnegative element of $\mathbb{R}^*$ such that
$\mu(W) = 1$ and $\mu(U \cup V) = \mu(U) + \mu(V)$ if $U$ and $V$ are disjoint.3

If $W$ is infinite, we may also require that $\mathcal{F}$ be a $\sigma$-algebra and that $\mu$ be countably
additive. (There are some subtleties involved with countable additivity in nonstandard
probability spaces; see Section 4.3.)
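To make these definitions concrete, one can compute with polynomials in $\epsilon$, which form a small part of $\mathbb{R}(\epsilon)$ (the full field also contains quotients). The sketch below is an illustration under assumptions that are not in the text above: the class `EpsPoly` is hypothetical, and the nonstandard measure is obtained from a length-2 LPS as $(1 - \epsilon)\mu_0 + \epsilon\mu_1$, a normalized variant of the rough expression $\mu_0(U) + \epsilon\mu_1(U)$ from the introduction, chosen so that $\mu(W) = 1$.

```python
from fractions import Fraction
from itertools import zip_longest

class EpsPoly:
    """a_0 + a_1*eps + a_2*eps^2 + ..., where eps is a positive infinitesimal."""

    def __init__(self, *coeffs):
        self.c = tuple(Fraction(x) for x in coeffs)

    def _pairs(self, other):
        return zip_longest(self.c, other.c, fillvalue=Fraction(0))

    def __add__(self, other):
        return EpsPoly(*(a + b for a, b in self._pairs(other)))

    def scale(self, r):
        """Multiply by a standard real r."""
        return EpsPoly(*(Fraction(r) * a for a in self.c))

    def __eq__(self, other):
        return all(a == b for a, b in self._pairs(other))

    def __lt__(self, other):
        # The lowest power of eps at which the coefficients differ decides.
        for a, b in self._pairs(other):
            if a != b:
                return a < b
        return False

    def st(self):
        """Standard part: the coefficient of eps^0."""
        return self.c[0] if self.c else Fraction(0)

def nps_prob(mu0, mu1, U):
    """(1 - eps)*mu0(U) + eps*mu1(U), written as a polynomial in eps."""
    p0 = sum(mu0[w] for w in U)
    p1 = sum(mu1[w] for w in U)
    return EpsPoly(p0, p1 - p0)

mu0 = {"w1": 1, "w2": 0}
mu1 = {"w1": 0, "w2": 1}
p_w1 = nps_prob(mu0, mu1, {"w1"})
p_w2 = nps_prob(mu0, mu1, {"w2"})

print(p_w2.st(), EpsPoly(0) < p_w2)     # standard part 0, yet strictly positive
print(p_w1 + p_w2 == EpsPoly(1))        # finite additivity, total mass 1
```

Here the event {w2} has standard part 0 but a positive infinitesimal probability, so it can still be conditioned on. The comparison rule also reflects property 2 above in this special case: any standard multiple of $\epsilon^2$ is smaller than $\epsilon$.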


3  Relating  Popper  Spaces  to  (S)LPS’s 

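In this section, I consider a mapping $F_{S \to P}$ from SLPS’s over $(W, \mathcal{F})$ to Popper spaces
over $(W, \mathcal{F})$, for each fixed $W$ and $\mathcal{F}$, and show that, in many cases of interest, $F_{S \to P}$
is a bijection. Given an SLPS $(W, \mathcal{F}, \vec{\mu})$ of length $\alpha$, consider the cps $(W, \mathcal{F}, \mathcal{F}', \mu)$ such
that $\mathcal{F}' = \bigcup_{\beta < \alpha} \{V \in \mathcal{F} : \mu_\beta(V) > 0\}$. For $V \in \mathcal{F}'$, let $\beta_V$ be the smallest index
such that $\mu_{\beta_V}(V) > 0$. Define $\mu(U \mid V) = \mu_{\beta_V}(U \mid V)$. I leave it to the reader to check that
$(W, \mathcal{F}, \mathcal{F}', \mu)$ is a Popper space.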
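For a finite-length LPS over a finite space, the definition of $\mu(U \mid V)$ just given can be written out directly. This is a sketch under those finiteness assumptions; `popper_from_lps` and `measure` are hypothetical names, and measures are represented as dictionaries from worlds to weights.

```python
from fractions import Fraction

def measure(m, U):
    """Probability that the measure m (a dict world -> weight) assigns to U."""
    return sum(p for w, p in m.items() if w in U)

def popper_from_lps(lps):
    """Return mu with mu(U, V) = mu_{beta_V}(U | V), beta_V the first index with mu_beta(V) > 0."""
    def mu(U, V):
        for m in lps:
            mass = measure(m, V)
            if mass > 0:
                return measure(m, U & V) / mass
        raise ValueError("V is not a conditioning event: every measure gives it probability 0")
    return mu

# Illustrative SLPS on W = {a, b, c}: a is the primary hypothesis, then b or c.
lps = [
    {"a": Fraction(1), "b": Fraction(0), "c": Fraction(0)},
    {"a": Fraction(0), "b": Fraction(1, 2), "c": Fraction(1, 2)},
]
mu = popper_from_lps(lps)
W = frozenset("abc")

print(mu(frozenset("b"), W))                # b has probability 0 a priori
print(mu(frozenset("b"), frozenset("bc")))  # but probability 1/2 given {b, c}
```

The event {b, c} has prior probability 0, yet conditioning on it is defined because the second measure in the sequence gives it positive probability.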

There are many bijections between two spaces. Why is $F_{S \to P}$ of interest? Suppose
that $F_{S \to P}(W, \mathcal{F}, \vec{\mu}) = (W, \mathcal{F}, \mathcal{F}', \mu)$. It is easy to check that the following two important
properties hold:

1. $\mathcal{F}'$ consists precisely of those events for which conditioning in the LPS is defined;
that is, $\mathcal{F}' = \{U : \vec{\mu}(U) > 0\}$.

2. For $U \in \mathcal{F}'$, $\mu(\cdot \mid U) = \mu'(\cdot \mid U)$, where $\mu'$ is the first probability measure in the
sequence $\vec{\mu}|U$. That is, the Popper measure agrees with the most significant
probability measure in the conditional LPS given $U$. Given that an LPS assigns to an
event $U$ a sequence of numbers and a Popper measure assigns to $U$ just a single
number, this is clearly the best single number to take.

It is clear that these two properties in fact characterize $F_{S \to P}$. Thus, $F_{S \to P}$ preserves the
events on which conditioning is possible and the most significant term in the lexicographic
probability.

2The construction of $\mathbb{R}(\epsilon)$ apparently goes back to Robinson [1973]. It is reviewed by Hammond
[1994, 1999] and Wilson [1995] (who calls $\mathbb{R}(\epsilon)$ the extended reals).

3Note that, unlike Hammond [1994, 1999], I do not restrict the range of probability measures to
consist of ratios of polynomials in $\epsilon$ with nonnegative coefficients.

3.1  The  finite  case 

It is useful to separate the analysis of $F_{S \to P}$ into two cases, depending on whether or not
the state space is finite. I consider the finite case first.

BBD claim without proof that $F_{S \to P}$ is a bijection from LCPS’s to conditional
probability spaces. They work in finite spaces $W$ (so that LCPS’s are equivalent to SLPS’s)
and restrict attention to LPS’s where $\mathcal{F} = 2^W$ and $\mathcal{F}' = 2^W - \{\emptyset\}$ (so that conditioning
is defined for all nonempty sets). Since $\mathcal{F}' = 2^W - \{\emptyset\}$, the cps’s they consider are all
Popper spaces. Hammond [1994] provides a careful proof of this result, under the
restrictions considered by BBD. I generalize Hammond’s result by considering finite Popper
spaces with arbitrary conditioning events. No new conceptual issues arise in doing this
extension; I include it here only to be able to contrast it with the other results.

Let $SLPS(W, \mathcal{F})$ denote the set of SLPS’s over $(W, \mathcal{F})$; let $SLPS(W, \mathcal{F}, \mathcal{F}')$ denote the
set of SLPS’s $(W, \mathcal{F}, \vec{\mu})$ such that $\vec{\mu}(U) > 0$ for all $U \in \mathcal{F}'$ (i.e., $\mu_\beta(U) > 0$ for some $\beta$); as
usual, I use a superscript $c$ to denote countable additivity, so, for example, $SLPS^c(W, \mathcal{F})$
denotes the set of countably additive SLPS’s over $(W, \mathcal{F})$. Let $Pop(W, \mathcal{F}, \mathcal{F}')$ denote the
set of Popper spaces of the form $(W, \mathcal{F}, \mathcal{F}', \mu)$ and let $Pop^c(W, \mathcal{F}, \mathcal{F}')$ denote the set of
Popper spaces of the form $(W, \mathcal{F}, \mathcal{F}', \mu)$ where $\mu$ is countably additive.

Theorem 3.1: If $W$ is finite, then $F_{S \to P}$ is a bijection from $SLPS(W, \mathcal{F}, \mathcal{F}')$ to $Pop(W, \mathcal{F}, \mathcal{F}')$.

Proof: It is immediate from the definition that if $(W, \mathcal{F}, \vec{\mu}) \in SLPS(W, \mathcal{F}, \mathcal{F}')$, then
$F_{S \to P}(W, \mathcal{F}, \vec{\mu}) \in Pop(W, \mathcal{F}, \mathcal{F}')$. It is also straightforward to show that $F_{S \to P}$ is an
injection (see the appendix for details). The work comes in showing that $F_{S \to P}$ is a
surjection (or, equivalently, in constructing an inverse to $F_{S \to P}$). I sketch the main ideas
of the argument here, leaving details to the appendix.

Given $\mu \in Pop(W, \mathcal{F}, \mathcal{F}')$, the idea is to choose $k < |W|$ and $k$ disjoint sets $U_0, \ldots, U_k \in
\mathcal{F}'$ appropriately such that $\mu_j = \mu \mid U_j$ for $j = 0, \ldots, k$ (i.e., $\mu_j(V) = \mu(V \mid U_j)$) and
$F_{S \to P}(W, \mathcal{F}, \vec{\mu}) = \mu$. Since the sets $U_0, \ldots, U_k$ are disjoint, $\vec{\mu}$ must be an SLPS. The
difficulty lies in choosing $U_0, \ldots, U_k$ so that $\vec{\mu}(U) > 0$ iff $U \in \mathcal{F}'$. This is done as follows.

Let $U_0$ be the smallest set $U \in \mathcal{F}$ such that $\mu(U \mid W) = 1$. Since $W$ is finite, there is such
a smallest set; it is simply the intersection of all sets $U$ such that $\mu(U \mid W) = 1$. Since
$\mu(U_0 \mid W) > 0$, it follows that $U_0 \in \mathcal{F}'$. If $\overline{U_0} \notin \mathcal{F}'$, then (because $\mathcal{F}'$ is closed under
supersets in $\mathcal{F}$), no subset of $\overline{U_0}$ is in $\mathcal{F}'$. If $\overline{U_0} \in \mathcal{F}'$, let $U_1$ be the smallest set in $\mathcal{F}$ such
that $\mu(U_1 \mid \overline{U_0}) = 1$. Note that $U_1 \subseteq \overline{U_0}$ and that $U_1 \in \mathcal{F}'$. Continuing in this way, it is
clear that there exists a $k \ge 0$ and a sequence of pairwise disjoint sets $U_0, U_1, \ldots, U_k$ such
that (1) $U_i \in \mathcal{F}'$ for $i = 0, \ldots, k$, (2) for $i < k$, $\overline{U_0 \cup \ldots \cup U_i} \in \mathcal{F}'$ and $U_{i+1}$ is the smallest
set in $\mathcal{F}$ such that $\mu(U_{i+1} \mid \overline{U_0 \cup \ldots \cup U_i}) = 1$, and (3) $\overline{U_0 \cup \ldots \cup U_k} \notin \mathcal{F}'$. Condition
(2) guarantees that $U_{i+1}$ is a subset of $\overline{U_0 \cup \ldots \cup U_i}$, so the $U_i$'s are pairwise disjoint.
Define the LPS $\vec{\mu} = (\mu_0, \ldots, \mu_k)$ by taking $\mu_i(V) = \mu(V \mid U_i)$. Clearly the support of $\mu_i$
is $U_i$, so this is an LCPS (and hence an SLPS). ∎
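
The peeling construction in the proof sketch can be illustrated in a few lines of code. This is only a sketch under simplifying assumptions: the algebra is taken to be all subsets of a finite $W$, the names `layers_from_popper`, `cond`, and `conditionable` are illustrative rather than taken from the paper, and the Popper measure in the example is itself generated from a hypothetical two-level LPS so that the round trip can be checked.

```python
from fractions import Fraction

def layers_from_popper(W, cond, conditionable):
    """Peel off U_0, U_1, ... for a finite W whose algebra is all subsets.

    cond(V, U) returns mu(V | U); conditionable(U) tells whether U is in F'.
    """
    remaining = frozenset(W)
    layers = []
    while remaining and conditionable(remaining):
        # The smallest set of conditional probability 1 given `remaining`
        # is the set of worlds that receive positive conditional probability.
        layer = frozenset(w for w in remaining
                          if cond(frozenset([w]), remaining) > 0)
        layers.append(layer)
        remaining -= layer
    return layers

def lps_from_layers(W, cond, layers):
    """Level i of the LPS is mu(. | U_i), stored as a dict over worlds."""
    return [{w: cond(frozenset([w]), U) for w in W} for U in layers]

# Hypothetical example: a Popper measure induced by a two-level LPS on {0, 1, 2, 3}.
# World 3 has probability 0 at every level, so {3} is not a conditioning event.
W = [0, 1, 2, 3]
half, one, zero = Fraction(1, 2), Fraction(1), Fraction(0)
levels = [{0: half, 1: half, 2: zero, 3: zero},
          {0: zero, 1: zero, 2: one, 3: zero}]

def mass(level, U):
    return sum(level[w] for w in U)

def conditionable(U):
    return any(mass(level, U) > 0 for level in levels)

def cond(V, U):
    # Condition on the first level that gives U positive probability.
    level = next(l for l in levels if mass(l, U) > 0)
    return mass(level, V & U) / mass(level, U)

layers = layers_from_popper(W, cond, conditionable)
recovered = lps_from_layers(W, cond, layers)
```

In this example the layers come out as {0, 1} followed by {2}, and the recovered levels coincide with the LPS that generated the Popper measure.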


Corollary 3.2: If $W$ is finite, then $F_{S \to P}$ is a bijection from $\mathit{SLPS}(W, \mathcal{F})$ to $\mathit{Pop}(W, \mathcal{F})$.

3.2  The  infinite  case 

The  case  where  the  state  space  W  is  infinite  is  not  considered  by  either  BBD  or  Hammond. 
It  presents  some  interesting  subtleties. 

It is easy to see that $F_{S \to P}$ is an injection from SLPS's to Popper spaces. However,
as the following two examples show, if we do not require countable additivity, then it is
not a bijection.

Example 3.3: (This example is essentially due to Robert Stalnaker [private communication,
2000].) Let $W = \mathbb{N}$, the natural numbers, let $\mathcal{F}$ consist of the finite and
cofinite subsets of $\mathbb{N}$ (recall that a cofinite set is the complement of a finite set), and
let $\mathcal{F}' = \mathcal{F} - \{\emptyset\}$. If $U$ is cofinite, take $\mu^1(V \mid U)$ to be 1 if $V$ is cofinite and 0 if $V$ is
finite. If $U$ is finite, define $\mu^1(V \mid U) = |V \cap U| / |U|$. I leave it to the reader to check
that $(\mathbb{N}, \mathcal{F}, \mathcal{F}', \mu^1)$ is a Popper space. Note that $\mu^1$ is not countably additive (since
$\mu^1(\{i\} \mid \mathbb{N}) = 0$ for all $i$, although $\mu^1(\mathbb{N} \mid \mathbb{N}) = 1$). Suppose that there were some LPS
$(\mathbb{N}, \mathcal{F}, \vec{\mu})$ which was mapped by $F_{S \to P}$ to this Popper space. Then it is easy to check that
if $\mu_i$ is the first measure in $\vec{\mu}$ such that $\mu_i(U) > 0$ for some finite set $U$, then $\mu_i(U') > 0$
for all nonempty finite sets $U'$. To see this, note that for any nonempty finite set $U'$,
since $\mu_i(U) > 0$, it follows that $\mu_i(U \cup U') > 0$. Since $U \cup U'$ is finite, it must be the
case that $\mu_i$ is the first measure in $\vec{\mu}$ such that $\mu_i(U \cup U') > 0$. Thus, by definition,
$\mu^1(U' \mid U \cup U') = \mu_i(U' \mid U \cup U')$. Since $\mu^1(U' \mid U \cup U') > 0$, it follows that $\mu_i(U') > 0$.
Thus, $\mu_i(U') > 0$ for all nonempty finite sets $U'$.

It is also easy to see that $\mu_i(U)$ must be proportional to $|U|$ for all finite sets $U$.
To show this, it clearly suffices to show that $\mu_i(n) = \mu_i(0)$ for all $n \in \mathbb{N}$. But this is
immediate from the observation that

$$\mu_i(\{0\} \mid \{0, n\}) = \mu^1(\{0\} \mid \{0, n\}) = |\{0\}| / |\{0, n\}| = 1/2.$$

But there is no probability measure $\mu_i$ on the natural numbers such that $\mu_i(n) = \mu_i(0) > 0$
for all $n \ge 0$. For if $\mu_i(0) > 1/N$, then $\mu_i(\{0, \ldots, N - 1\}) > 1$, a contradiction. (See
Example 4.8 for further discussion of this setup.) ∎
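
The two facts used at the end of Example 3.3 can be checked numerically. The sketch below is illustrative only: the helper names are not from the paper, finite sets are modeled as Python sets, and the common weight 1/10 is an arbitrary choice.

```python
from fractions import Fraction

def mu1_given_finite(V, U):
    """mu^1(V | U) for a nonempty finite U: the counting ratio |V ∩ U| / |U|."""
    return Fraction(len(V & U), len(U))

def first_overflow(c):
    """Smallest N such that N equal weights c > 0 already sum to more than 1."""
    N = 1
    while N * c <= 1:
        N += 1
    return N

# Conditioning on {0, n} forces 0 and n to have equal weight at level i.
ratios = [mu1_given_finite({0}, {0, n}) for n in range(1, 6)]
# With common weight 1/10, the first 11 numbers would have total weight 11/10.
N = first_overflow(Fraction(1, 10))
```

Whatever positive weight is chosen, `first_overflow` terminates, which is the finite witness of the contradiction described above.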

Example 3.4: Again, let $W = \mathbb{N}$, let $\mathcal{F}$ consist of the finite and cofinite subsets of $\mathbb{N}$,
and let $\mathcal{F}' = \mathcal{F} - \{\emptyset\}$. As with $\mu^1$, if $U$ is cofinite, take $\mu^2(V \mid U)$ to be 1 if $V$ is cofinite and
0 if $V$ is finite. However, now, if $U$ is finite, define $\mu^2(V \mid U) = 1$ if $\max(V \cap U) = \max U$,
and $\mu^2(V \mid U) = 0$ otherwise. Intuitively, if $n > n'$, then $n$ is infinitely more probable
than $n'$ according to $\mu^2$. Again, I leave it to the reader to check that $(\mathbb{N}, \mathcal{F}, \mathcal{F}', \mu^2)$ is a
Popper space. Suppose there were some LPS $(\mathbb{N}, \mathcal{F}, \vec{\mu})$ which was mapped by $F_{S \to P}$ to
this Popper space. Then it is easy to check that if $\mu_n$ is the first measure in $\vec{\mu}$ such that
$\mu_n(\{n\}) > 0$, then $\mu_n$ comes before $\mu_{n'}$ in $\vec{\mu}$ if $n > n'$. However, since $\vec{\mu}$ is well-founded,
this is impossible. ∎
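
The finite part of $\mu^2$ is easy to experiment with. The sketch below restricts attention to finite conditioning sets and brute-force checks a multiplicative condition of the form $\mu(A \cap B \mid C) = \mu(A \mid B \cap C) \times \mu(B \mid C)$ on a small universe. That form of CP3 is assumed here (it is the usual one; the paper's own statement appears in an earlier section), and the function names are illustrative.

```python
from itertools import combinations

def mu2_given_finite(V, U):
    """mu^2(V | U) for a nonempty finite U: 1 iff the largest element of U is in V.

    This is the same condition as max(V ∩ U) == max(U).
    """
    return 1 if max(U) in V else 0

def subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def cp3_holds(universe):
    """Check mu(A ∩ B | C) == mu(A | B ∩ C) * mu(B | C) whenever B ∩ C is nonempty."""
    all_sets = subsets(universe)
    for A in all_sets:
        for B in all_sets:
            for C in all_sets:
                if B & C:
                    lhs = mu2_given_finite(A & B, C)
                    rhs = mu2_given_finite(A, B & C) * mu2_given_finite(B, C)
                    if lhs != rhs:
                        return False
    return True

larger_wins = mu2_given_finite({7}, {3, 7})    # 1: the larger number gets all the weight
smaller_loses = mu2_given_finite({3}, {3, 7})  # 0
cp3_ok = cp3_holds(range(4))
```

The pair `larger_wins`, `smaller_loses` is the computational counterpart of "n is infinitely more probable than n'".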

As  the  following  theorem,  originally  proved  by  Spohn  [1986],  shows,  there  are  no 
such  counterexamples  if  we  restrict  to  countably  additive  SLPS’s  and  countably  additive 
Popper  spaces. 

Theorem 3.5: [Spohn 1986] For all $W$, the map $F_{S \to P}$ is a bijection from $\mathit{SLPS}^c(W, \mathcal{F}, \mathcal{F}')$
to $\mathit{Pop}^c(W, \mathcal{F}, \mathcal{F}')$.

Proof: Again, the difficulty comes in showing that $F_{S \to P}$ is onto. Given a Popper space
$(W, \mathcal{F}, \mathcal{F}', \mu)$, I again construct sets $U_0, U_1, \ldots$ and an LPS $\vec{\mu}$ such that $\mu_\beta(V) = \mu(V \mid U_\beta)$,
and show that $F_{S \to P}(W, \mathcal{F}, \vec{\mu}) = (W, \mathcal{F}, \mathcal{F}', \mu)$. However, now a completely different
construction is required; the earlier inductive construction of the sequence $U_0, \ldots, U_k$ no
longer works. The problem already arises in the construction of $U_0$. There may no longer
be a smallest set $U_0$ such that $\mu(U_0) = 1$. Consider, for example, the interval $[0, 1]$ with
Borel measure. There is clearly no smallest subset $U$ of $[0, 1]$ such that $\mu(U) = 1$. The
details can be found in the appendix. ∎


Corollary 3.6: For all $W$, the map $F_{S \to P}$ is a bijection from $\mathit{SLPS}^c(W, \mathcal{F})$ to $\mathit{Pop}^c(W, \mathcal{F})$.

It is important in Corollary 3.6 that we consider SLPS's and not MSLPS's or LCPS's.
$F_{S \to P}$ is in fact not a bijection from MSLPS's or LCPS's to Popper spaces.

Example 3.7: Consider the Popper space $([0, 1], \mathcal{F}, \mathcal{F}', \mu)$ which is the image under
$F_{S \to P}$ of the SLPS constructed in Example 2.3. It is easy to see that this Popper space
cannot be the image under $F_{S \to P}$ of some MSLPS (and hence not of some LCPS either).

3.3  Treelike  CPS's


One of the requirements in a Popper space is that $\mathcal{F}'$ be closed under supersets in $\mathcal{F}$.
If we think of $\mathcal{F}'$ as consisting of all sets on which conditioning is possible, this makes
sense; if we can condition on a set $U$, we should be able to condition on a superset $V$ of
$U$. But if we think of $\mathcal{F}'$ as representing all the possible evidence that can be obtained
(and thus, the set of events on which an agent must be able to condition, so as to
update her beliefs), there is no reason that $\mathcal{F}'$ should be closed under supersets; nor, for
that matter, is it necessarily the case that if $U \in \mathcal{F}'$ and $\mu(V \mid U) \ne 0$, then $V \cap U \in \mathcal{F}'$.
In general, a cps where $\mathcal{F}'$ does not have these properties cannot be represented by an
LPS, as the following example shows.

Example 3.8: Let $W = \{w_1, w_2, w_3, w_4\}$, let $\mathcal{F}$ consist of all subsets of $W$, and let $\mathcal{F}'$
consist of all the 2-element subsets of $W$. Clearly $\mathcal{F}'$ is not closed under supersets. Define
/i  on  T  x  T'  such  that  [i[w\  |  =  ^(uq  |  {-u^uq})  =  1/3,  and  [i(w\  |  {uqjUq})  = 

yu(uq  |  {w3,  icq})  =  1/2,  and  CPI  and  CP2  hold.  This  is  easily  seen  to  determine  [i. 
Moreover, $\mu$ vacuously satisfies CP3, since there do not exist distinct sets $U$ and $X$ in $\mathcal{F}'$
such that $U \subseteq X$. It is easy to show that there is no unconditional probability $\mu^*$ on $W$
such that $\mu^*(U \mid V) = \mu(U \mid V)$ for all pairs $(U, V) \in \mathcal{F} \times \mathcal{F}'$ such that $\mu^*(V) > 0$ (where,
for $\mu^*$, the conditional probability is defined in the standard way).$^4$ It easily follows that
there is no LPS $\vec{\mu}$ such that $\vec{\mu}(U \mid V) = \mu(U \mid V)$ for all $(U, V) \in \mathcal{F} \times \mathcal{F}'$ (since otherwise
$\mu_0$ would agree with $\mu$ on all pairs $(U, V) \in \mathcal{F} \times \mathcal{F}'$ such that $\mu_0(V) > 0$). Had $\mathcal{F}'$ been
closed under supersets, it would have included $W$. It is easy to see that it is impossible
to extend $\mu$ to $\mathcal{F} \times (\mathcal{F}' \cup \{W\})$ so that CP3 holds. ∎

In the game-theory literature, Battigalli and Siniscalchi [2002] use conditional probability
measures to model players' beliefs about other players' strategies in extensive-form
games where agents have perfect recall. The conditioning events are essentially information
sets, which can be thought of as representing the possible evidence that an agent can
obtain in a game. Thus, the cps's they consider are not necessarily Popper spaces, for the
reasons described above. Nevertheless, the conditioning events considered by Battigalli
and Siniscalchi satisfy certain properties that prevent an analogue of Example 3.8 from
holding. I now make this precise.

Formally, I assume that there is a one-to-one correspondence between the sets in
$\mathcal{F}'$ and the information sets of some fixed player $i$. For each set $U \in \mathcal{F}'$ there is a
unique information set $I_U$ for player $i$ such that $U$ consists of all the strategy profiles
that reach $I_U$. With this identification, it is immediate that we can organize the sets
in $\mathcal{F}'$ into a forest (i.e., a collection of trees), with the same "reachability" structure as
that of the information sets in the game tree. The topmost sets in the forest are the
ones corresponding to the topmost information sets for player $i$ in the game tree. There

4This  example  is  closely  related  to  examples  of  conditional  probabilities  for  which  there  is  no  common 
prior;  see,  for  example,  [Halpern  2002,  Example  2.2]. 
may be several such topmost information sets if nature or some player $j$ other than $i$
makes the first move in the game. (That is why we have a forest, rather than a tree.)
The immediate successors of a set $U$ are the sets of strategy profiles corresponding to
information sets for player $i$ reached immediately after $I_U$. Because agents have perfect
recall, the conditioning events $\mathcal{F}'$ have the following properties:

T1. $\mathcal{F}'$ is countable.

T2. The elements of $\mathcal{F}'$ can be organized as a forest (i.e., a collection of trees) where,
for each $U \in \mathcal{F}'$, if there is an edge from $U$ to some $U' \in \mathcal{F}'$, then $U' \subseteq U$, all
the immediate successors of $U$ are disjoint, and $U$ is the union of its immediate
successors.

T3. The topmost nodes in each tree of the forest form a partition of $W$.

Say that a set $\mathcal{F}'$ is treelike if it satisfies T1-T3. It follows from T2 and T3 that, for any
sets $U$ and $U'$ in a treelike set $\mathcal{F}'$, either $U \subseteq U'$ (if $U$ is a descendant of $U'$ in some tree),
$U' \subseteq U$ (if $U'$ is a descendant of $U$), or $U$ and $U'$ are disjoint (if neither is a descendant
of the other). If $\mathcal{F}'$ is treelike, let $T^c(W, \mathcal{F}, \mathcal{F}')$ consist of all countably additive cps's
defined on $\mathcal{F} \times \mathcal{F}'$. I abuse notation in the next result, viewing $F_{S \to P}$ as a mapping from
$\mathit{SLPS}^c(W, \mathcal{F}, \mathcal{F}')$ to $T^c(W, \mathcal{F}, \mathcal{F}')$.
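
The consequence of T2 and T3 noted above (any two conditioning events are nested or disjoint) is easy to test on finite examples. The following is a minimal sketch with illustrative function names; it checks only that consequence and the partition requirement of T3, not the full forest structure of T2, and it assumes the family is given without duplicates.

```python
from itertools import combinations

def is_nested_or_disjoint(family):
    """True if every pair of sets in the family is nested or disjoint."""
    sets = [frozenset(s) for s in family]
    return all(a <= b or b <= a or not (a & b)
               for i, a in enumerate(sets) for b in sets[i + 1:])

def topmost_partition(family, W):
    """True if the maximal sets of the family are pairwise disjoint and cover W (cf. T3)."""
    sets = [frozenset(s) for s in family]
    tops = [s for s in sets if not any(s < t for t in sets)]
    covered = frozenset().union(*tops)
    disjoint = sum(len(t) for t in tops) == len(covered)
    return disjoint and covered == frozenset(W)

W = {1, 2, 3, 4}
# A hypothetical treelike family: two roots, one of which splits into singletons.
forest = [{1, 2}, {3, 4}, {1}, {2}]
# The conditioning events of Example 3.8: all 2-element subsets of W.
pairs = [set(c) for c in combinations(sorted(W), 2)]

forest_ok = is_nested_or_disjoint(forest) and topmost_partition(forest, W)
pairs_ok = is_nested_or_disjoint(pairs)
```

The family from Example 3.8 fails the nested-or-disjoint test, which is consistent with the remark that treelike families rule out an analogue of that example.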

Proposition 3.9: The map $F_{S \to P}$ is a surjection from $\mathit{SLPS}^c(W, \mathcal{F}, \mathcal{F}')$ onto $T^c(W, \mathcal{F}, \mathcal{F}')$.

Since $\mathcal{F}'$ is countable, every SLPS in $\mathit{SLPS}^c(W, \mathcal{F}, \mathcal{F}')$ must have at most countable
length. Thus, there is no distinction between SLPS's, LCPS's, and MSLPS's in this
case. (Indeed, in the proof of Proposition 3.9, the LPS constructed to demonstrate the
surjection is an LCPS.) Note that we cannot hope to get a bijection here, even if $W$ is
finite. For example, suppose that $W = \{w_1, w_2\}$, $\mathcal{F} = 2^W$, and $\mathcal{F}' = \{\{w_1\}, \{w_2\}\}$. $\mathcal{F}'$
is clearly treelike, and there is a unique cps $\mu$ on $(W, \mathcal{F}, \mathcal{F}')$. $F_{S \to P}$ maps every SLPS in
$\mathit{SLPS}(W, \mathcal{F}, \mathcal{F}')$ to $\mu$, but is clearly not a bijection. (This example also shows that we
do not get a bijection by considering MSLPS's or LCPS's either.)

3.4  Related  Work 

It is interesting to contrast these results to those of Rényi [1956] and van Fraassen [1976].
Rényi considers what he calls dimensionally ordered systems. A dimensionally ordered
system over $(W, \mathcal{F})$ has the form $(W, \mathcal{F}, \mathcal{F}', \{\mu_i : i \in I\})$, where $\mathcal{F}$ is an algebra of subsets
of $W$, $\mathcal{F}'$ is a subset of $\mathcal{F}$ closed under finite unions (but not necessarily closed under
supersets in $\mathcal{F}$), $I$ is a totally ordered set (but not necessarily well-founded, so it may
not, for example, have a first element) and $\mu_i$ is a measure on $(W, \mathcal{F})$ (not necessarily a
probability measure) such that
•  for each $U \in \mathcal{F}'$, there is some $i \in I$ such that $0 < \mu_i(U) < \infty$ (note that the
measure of a set may, in general, be $\infty$),

•  if $\mu_i(U) < \infty$ and $j < i$, then $\mu_j(U) = 0$.

Note that it follows from these conditions that for each $U \in \mathcal{F}'$, there is exactly one $i \in I$
such that $0 < \mu_i(U) < \infty$.

There is an obvious analogue of the map $F_{S \to P}$ mapping dimensionally ordered systems
to cps's. Namely, let $F_{D \to C}$ map the dimensionally ordered system $(W, \mathcal{F}, \mathcal{F}', \{\mu_i : i \in I\})$
to the cps $(W, \mathcal{F}, \mathcal{F}', \mu)$, where $\mu(V \mid U) = \mu_i(V \mid U)$, where $i$ is the unique element
of $I$ such that $0 < \mu_i(U) < \infty$. Rényi shows that $F_{D \to C}$ is a bijection from dimensionally
ordered systems to cps's where the set $\mathcal{F}'$ is closed under finite unions. (Császár
[1955] extends this result to cases where the set $\mathcal{F}'$ is not necessarily closed under finite
unions.) Rényi assumes that all measures involved are countably additive and that $\mathcal{F}$
is a $\sigma$-algebra, but these are inessential assumptions. That is, his proof goes through
without change if $\mathcal{F}$ is an algebra and the measures are additive; all that happens is that
the resulting conditional probability measure is additive rather than $\sigma$-additive.

It is critical in Rényi's framework that the $\mu_i$'s are arbitrary measures, and not just
probability measures. His result does not hold if the $\mu_i$'s are required to be probability
measures. In the case of finitely additive measures, the Popper space constructed in
Example 3.3 already shows why. It corresponds to a dimensionally ordered space $(\mu_1, \mu_2)$
where $\mu_1(U)$ is 1 if $U$ is cofinite and 0 if $U$ is finite and $\mu_2(U) = |U|$ (i.e., the measure of
a set is its cardinality). It cannot be captured by a dimensionally ordered space where
all the elements are probability measures, for the same reason that it is not the image
of an SLPS under $F_{S \to P}$. (Rényi [1956] actually provides a general characterization of
when the $\mu_i$'s can be taken to be (countably additive) probability measures.) Another
example is provided by the Popper space considered in Example 3.4. This corresponds
to the dimensionally ordered system $\{\mu_n : n \in \mathbb{N} \cup \{\infty\}\}$, where

$$\mu_n(U) = \begin{cases} 0 & \text{if } \max(U) < n \\ 1 & \text{if } \max(U) = n \\ \infty & \text{if } \max(U) > n, \end{cases}$$

where $\max(U)$ is taken to be $\infty$ if $U$ is cofinite.
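
To make the role of non-probability measures concrete, the sketch below encodes the two-measure system described for Example 3.3 (an indicator of cofiniteness and the counting measure) and recovers the conditional probabilities from it. The encoding of finite and cofinite sets as tagged pairs, and all function names, are assumptions made for illustration.

```python
from fractions import Fraction
from math import inf

# A finite set is (False, elems); a cofinite set is (True, excluded elems).
def finite(*xs):
    return (False, frozenset(xs))

def cofinite(*excluded):
    return (True, frozenset(excluded))

def intersect(A, B):
    (ca, ea), (cb, eb) = A, B
    if ca and cb:
        return (True, ea | eb)      # complement of the union of excluded elements
    if ca:
        return (False, eb - ea)     # finite set minus the excluded elements
    if cb:
        return (False, ea - eb)
    return (False, ea & eb)

def m1(U):
    """1 on cofinite sets, 0 on finite sets."""
    return 1 if U[0] else 0

def m2(U):
    """Cardinality, which is infinite for cofinite sets."""
    return inf if U[0] else len(U[1])

def cond(V, U, measures=(m1, m2)):
    """mu(V | U), computed from the unique measure m with 0 < m(U) < infinity.

    U must be nonempty; otherwise no such measure exists.
    """
    m = next(m for m in measures if 0 < m(U) < inf)
    return Fraction(m(intersect(V, U)), m(U))
```

For instance, `cond(finite(0), finite(0, 5))` evaluates to 1/2 via the counting measure, while conditioning on a cofinite set uses the first measure and returns 1 or 0 according to whether the first argument is cofinite.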

Krauss [1968] restricts to Popper algebras of the form $\mathcal{F} \times (\mathcal{F} - \{\emptyset\})$; this allows him
to simplify and generalize Rényi's analysis. Interestingly, he also proves a representation
theorem in the spirit of Rényi's that involves nonstandard probability.

Van Fraassen [1976] proves a result whose assumptions are somewhat closer to Theorem 3.5.
Van Fraassen considers what he calls ordinal families of probability measures.
An ordinal family over $(W, \mathcal{F})$ is a sequence of the form $\{(W_\beta, \mathcal{F}_\beta, \mu_\beta) : \beta < \alpha\}$ such that

•  $\bigcup_{\beta < \alpha} W_\beta = W$;

•  $\mathcal{F}_\beta$ is an algebra over $W_\beta$;

•  $\mu_\beta$ is a probability measure with domain $\mathcal{F}_\beta$;

•  $\bigcup_{\beta < \alpha} \mathcal{F}_\beta = \mathcal{F}$;

•  if $U \in \mathcal{F}$ and $V \in \mathcal{F}_\beta$, then $U \cap V \in \mathcal{F}_\beta$;

•  if $U \in \mathcal{F}$, $U \cap V \in \mathcal{F}_\beta$, and $\mu_\beta(U \cap V) > 0$, then there exists $\gamma$ such that $U \in \mathcal{F}_\gamma$
and $\mu_\gamma(U) > 0$.

Given an ordinal family $\{(W_\beta, \mathcal{F}_\beta, \mu_\beta) : \beta < \alpha\}$ over $(W, \mathcal{F})$, consider the map
$F_{O \to C}$ which associates with it the cps $(W, \mathcal{F}, \mathcal{F}', \mu)$, where $\mathcal{F}' = \{U \in \mathcal{F} : \mu_\gamma(U) > 0$
for some $\gamma < \alpha\}$ and $\mu(V \mid U) = \mu_\beta(V \mid U)$, where $\beta$ is the smallest ordinal such that
$U \in \mathcal{F}_\beta$ and $\mu_\beta(U) > 0$. Van Fraassen shows that $F_{O \to C}$ is a bijection from ordinal
families over $(W, \mathcal{F})$ to Popper spaces over $(W, \mathcal{F})$. Again, for van Fraassen, countable
additivity does not play a significant role. If $\mathcal{F}$ is a $\sigma$-algebra, a countably additive ordinal
family over $(W, \mathcal{F})$ is defined just as an ordinal family, except that now $\mathcal{F}_\beta$ is a
$\sigma$-algebra over $W_\beta$ for all $\beta < \alpha$, $\mu_\beta$ is a countably additive probability measure, and $\mathcal{F}$ is
the least $\sigma$-algebra containing $\bigcup_{\beta < \alpha} \mathcal{F}_\beta$ (since $\bigcup_{\beta < \alpha} \mathcal{F}_\beta$ is not in general a $\sigma$-algebra). The
same map $F_{O \to C}$ is also a bijection from countably additive ordinal families to countably
additive Popper spaces.

Spohn's result, Theorem 3.5, can be viewed as a strengthening of van Fraassen's
result in the countably additive case, since for Theorem 3.5 all the $\mathcal{F}_\beta$'s are required to
be identical. This is a nontrivial requirement. The fact that it cannot be met in the case
that $W$ is infinite and the measures are not countably additive is an indication of this.

It is worth seeing how van Fraassen's approach handles the finitely additive examples
which do not correspond to SLPS's. The Popper space in Example 3.3 corresponds
to the ordinal family $\{(W_n, \mathcal{F}_n, \mu_n) : n \le \omega\}$ where, for $n < \omega$, $W_n = \{1, \ldots, n\}$,
$\mathcal{F}_n$ consists of all subsets of $W_n$, and $\mu_n$ is the uniform measure, while $W_\omega = \mathbb{N}$, $\mathcal{F}_\omega$
consists of the finite and cofinite subsets of $\mathbb{N}$, and $\mu_\omega(U)$ is 1 if $U$ is cofinite and 0 if
$U$ is finite. It is easy to check that this ordinal family has the desired properties. The
Popper space in Example 3.4 is represented in a similar way, using the ordinal family
$\{(W_n, \mathcal{F}_n, \mu'_n) : n \le \omega\}$, where $\mu'_n(U)$ is 1 if $n \in U$ and 0 otherwise, while $\mu'_\omega = \mu_\omega$. I
leave it to the reader to see that this family has the desired properties. The key point
to observe here is the leverage obtained by allowing each probability measure to have a
different domain.


4  Relating  LPS’s  to  NPS’s 

In  this  section,  I  show  that  LPS’s  and  NPS’s  are  isomorphic  in  a  strong  sense.  Again,  I 
separate  the  results  for  the  finite  case  and  the  infinite  case. 
4.1  The  finite  case


Consider an LPS of the form $(\mu_1, \mu_2, \mu_3)$. Roughly speaking, the corresponding NPS
should be $(1 - \epsilon - \epsilon^2)\mu_1 + \epsilon\mu_2 + \epsilon^2\mu_3$, where $\epsilon$ is some infinitesimal. That means that
$\mu_2$ gets infinitesimal weight relative to $\mu_1$ and $\mu_3$ gets infinitesimal weight relative to $\mu_2$.
But which infinitesimal $\epsilon$ should be chosen? Intuitively, it shouldn't matter. No matter
which infinitesimal is chosen, the resulting NPS should be equivalent to the original LPS.
I now make this intuition precise.

Suppose that we want to use an LPS or an NPS to compute which of two bounded,
real-valued random variables has higher expected value. The intended application here is
decision making, where the random variables can be thought of as the utilities corresponding
to two actions; the one with higher expected utility is preferred. The idea is
that two measures of uncertainty (each of which can be an LPS or an NPS) are equivalent
if the preference order they place on (real-valued) random variables (according to their
expected value) is the same. I consider only random variables with countable range. This
restriction both makes the exposition simpler and avoids having to define, for example,
integration with respect to an NPS. Note that, given an LPS $\vec{\mu}$, the expected value of
a random variable $X$ is $\sum_x x \vec{\mu}(X = x)$, where $\vec{\mu}(X = x)$ is a sequence of probability
values and the multiplication and addition are pointwise. Thus, the expected value is a
sequence; these sequences can be compared using the lexicographic order $<_L$ defined in
Section 2.2. If $\nu$ is either an LPS or NPS, then let $E_\nu(X)$ denote the expected value of
random variable $X$ according to $\nu$.
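
As a small illustration of expected values under an LPS, the sketch below represents an LPS of finite length as a list of dictionaries and compares expectations lexicographically. It assumes the order of Section 2.2 is the usual lexicographic order on sequences, which coincides with Python's tuple comparison; the LPS and the random variables are hypothetical.

```python
from fractions import Fraction

def expectation(measure, X):
    """Expected value of X (a dict world -> value) under one probability measure."""
    return sum(measure[w] * X[w] for w in measure)

def lps_expectation(lps, X):
    """Expected value under an LPS: one expectation per level, as a tuple."""
    return tuple(expectation(mu, X) for mu in lps)

def weakly_prefers(lps, X, Y):
    """True if E(X) >= E(Y) in the lexicographic order (tuple comparison)."""
    return lps_expectation(lps, X) >= lps_expectation(lps, Y)

# Hypothetical two-level LPS on worlds a, b, c.
lps = [{"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)},
       {"a": Fraction(0), "b": Fraction(0), "c": Fraction(1)}]
X = {"a": 1, "b": 0, "c": 5}
Y = {"a": 0, "b": 1, "c": 0}
ex, ey = lps_expectation(lps, X), lps_expectation(lps, Y)
```

Here the two variables tie at the first level, so the second level breaks the tie in favor of `X`.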

Definition 4.1: If each of $\nu_1$ and $\nu_2$ is either an NPS over $(W, \mathcal{F})$ or an LPS over $(W, \mathcal{F})$,
then $\nu_1$ is equivalent to $\nu_2$, denoted $\nu_1 \approx \nu_2$, if, for all real-valued random variables $X$
and $Y$ measurable with respect to $\mathcal{F}$, $E_{\nu_1}(X) \le E_{\nu_1}(Y)$ iff $E_{\nu_2}(X) \le E_{\nu_2}(Y)$. (If $X$
has countable range, which is the only case I consider here, then $X$ is measurable with
respect to $\mathcal{F}$ iff $\{w : X(w) = x\} \in \mathcal{F}$ for all $x$ in the range of $X$.)$^5$ ∎

This notion of equivalence satisfies analogues of the two key properties of the map
$F_{S \to P}$ considered at the beginning of Section 3.

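Proposition 4.2: If $\nu \in \mathit{NPS}(W, \mathcal{F})$, $\vec{\mu} \in \mathit{LPS}(W, \mathcal{F})$, and $\nu \approx \vec{\mu}$, then $\nu(U) > 0$ iff
$\vec{\mu}(U) > 0$. Moreover, if $\nu(U) > 0$, then $\mathrm{st}(\nu(V \mid U)) = \mu_j(V \mid U)$, where $\mu_j$ is the first
probability measure in $\vec{\mu}$ such that $\mu_j(U) > 0$.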
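
The second claim of Proposition 4.2 can be illustrated for an NPS of the form sketched at the start of this subsection. The code below is a sketch under that assumption: the NPS is taken to be $(1 - \epsilon - \cdots - \epsilon^k)\mu_0 + \epsilon\mu_1 + \cdots + \epsilon^k\mu_k$, a nonstandard probability is stored as its list of coefficients in increasing powers of $\epsilon$, and the function names and the example LPS are hypothetical.

```python
from fractions import Fraction

def nps_coeffs(lps, U):
    """Coefficients of nu(U) in powers of eps, for
    nu = (1 - eps - ... - eps^k) mu_0 + eps mu_1 + ... + eps^k mu_k."""
    p = [sum(mu[w] for w in U) for mu in lps]
    return [p[0]] + [pj - p[0] for pj in p[1:]]

def st_conditional(lps, V, U):
    """Standard part of nu(V | U) = nu(V ∩ U) / nu(U), assuming nu(U) > 0."""
    num, den = nps_coeffs(lps, V & U), nps_coeffs(lps, U)
    # The lowest power of eps with a nonzero coefficient dominates nu(U);
    # since nu(V ∩ U) <= nu(U), the numerator has no lower-order term.
    j = next(i for i, c in enumerate(den) if c != 0)
    return Fraction(num[j]) / Fraction(den[j])

# Hypothetical LPS: mu_0 is a point mass on a; mu_1 splits its weight between b and c.
lps = [{"a": Fraction(1), "b": Fraction(0), "c": Fraction(0)},
       {"a": Fraction(0), "b": Fraction(1, 3), "c": Fraction(2, 3)}]

given_bc = st_conditional(lps, {"b"}, {"b", "c"})   # uses mu_1
given_ab = st_conditional(lps, {"b"}, {"a", "b"})   # uses mu_0
```

In both cases the result agrees with conditioning on the first level of the LPS that gives the conditioning event positive probability.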

5 As pointed out by Adam Brandenburger and Eddie Dekel, this notion of equivalence is essentially
the same as one implicitly used by BBD. They work with preference orders on Anscombe-Aumann acts
[Anscombe and Aumann 1963], that is, functions from states to probability measures on prizes. Fix a
utility function $u$ on prizes. Then take $\nu_1 \approx_u \nu_2$ if the preference order on acts generated by $\nu_1$ and
$u$ is the same as that generated by $\nu_2$ and $u$. It is not hard to show that this notion of equivalence is
independent of the choice of utility function; if $u$ and $u'$ are two utility functions on prizes, then $\nu_1 \approx_u \nu_2$
iff $\nu_1 \approx_{u'} \nu_2$. Moreover, $\nu_1 \approx_u \nu_2$ iff $\nu_1 \approx \nu_2$. The advantage of the notion of equivalence used here is
that it is defined without the overhead of preference orders on acts.
As the next result shows, for SLPS's, the $\approx$-equivalence classes are singletons, even
if the set of worlds is infinite. (This is not true for LPS's in general. For example,
$(\mu, \mu) \approx (\mu)$.) This can be viewed as providing more motivation for the use of SLPS's.

Proposition 4.3: If $\vec{\mu}, \vec{\mu}' \in \mathit{SLPS}(W, \mathcal{F})$, then $\vec{\mu} \approx \vec{\mu}'$ iff $\vec{\mu} = \vec{\mu}'$.

The next result justifies restricting to finite LPS's if the state space is finite. Given
an algebra $\mathcal{F}$, let $\mathit{Basic}(\mathcal{F})$ consist of the basic sets in $\mathcal{F}$, that is, the nonempty sets in
$\mathcal{F}$ that themselves contain no nonempty proper subsets in $\mathcal{F}$. Clearly the sets in $\mathit{Basic}(\mathcal{F})$ are
disjoint, so that $|\mathit{Basic}(\mathcal{F})| \le |W|$. If all sets are measurable, then $\mathit{Basic}(\mathcal{F})$ consists of
the singleton subsets of $W$. If $W$ is finite, it is easy to see that all sets in $\mathcal{F}$ are finite
unions of the sets in $\mathit{Basic}(\mathcal{F})$.
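
For a finite algebra given explicitly as a list of sets, the basic sets can be computed directly from the definition. The following sketch uses an illustrative function name and a hypothetical algebra generated by a three-block partition.

```python
def basic_sets(algebra):
    """Nonempty sets of the algebra with no nonempty proper subset in the algebra."""
    sets = {frozenset(s) for s in algebra}
    nonempty = [s for s in sets if s]
    return {s for s in nonempty if not any(t < s for t in nonempty)}

# Hypothetical algebra over W = {1, 2, 3, 4} generated by the partition {1, 2}, {3}, {4}.
algebra = [set(), {1, 2}, {3}, {4}, {1, 2, 3}, {1, 2, 4}, {3, 4}, {1, 2, 3, 4}]
atoms = basic_sets(algebra)

# Every set in the algebra is the union of the basic sets it contains.
unions_ok = all(
    frozenset(s) == frozenset().union(*[a for a in atoms if a <= frozenset(s)])
    for s in algebra
)
```

Here there are three basic sets for a four-element state space, consistent with the bound on the number of basic sets stated above.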

Proposition 4.4: If $W$ is finite, then every LPS over $(W, \mathcal{F})$ is equivalent to an LPS
of length at most $|\mathit{Basic}(\mathcal{F})|$.

I can now define the bijection that relates NPS's and LPS's. Given $(W, \mathcal{F})$, let
$\mathit{LPS}(W, \mathcal{F})/\!\approx$ be the equivalence classes of $\approx$-equivalent LPS's over $(W, \mathcal{F})$; similarly,
let $\mathit{NPS}(W, \mathcal{F})/\!\approx$ be the equivalence classes of $\approx$-equivalent NPS's over $(W, \mathcal{F})$. Note
that in $\mathit{NPS}(W, \mathcal{F})/\!\approx$, it is possible that different nonstandard probability measures
could have different ranges. For this section, without loss of generality, I could also
fix the range of all NPS's to be the nonstandard model $\mathbb{R}(\epsilon)$ discussed in Section 2.3.
However, in the infinite case, it is not possible to restrict to a single nonstandard model,
so I do not do so here either, for uniformity.

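Now define the mapping $F_{L \to N}$ from $\mathit{LPS}(W, \mathcal{F})/\!\approx$ to $\mathit{NPS}(W, \mathcal{F})/\!\approx$ pretty much as
suggested at the beginning of this subsection: If $[\vec{\mu}]$ is an equivalence class of LPS's, then
choose a representative $\vec{\mu}' \in [\vec{\mu}]$ with finite length. Fix an infinitesimal $\epsilon$. Suppose that
$\vec{\mu}' = (\mu_0, \ldots, \mu_k)$. Let $F_{L \to N}([\vec{\mu}]) = [(1 - \epsilon - \cdots - \epsilon^k)\mu_0 + \epsilon\mu_1 + \cdots + \epsilon^k\mu_k]$.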
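
The claim that the NPS built this way orders random variables exactly as the LPS does can be checked by brute force on a small example. In the sketch below, an expected value under the NPS is stored as its tuple of coefficients in increasing powers of $\epsilon$, and two such values are compared by the lowest power at which they differ (valid because $\epsilon$ is a positive infinitesimal). The function names and the example LPS are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def level_expectations(lps, X):
    """Tuple of expectations of X, one per level of the LPS."""
    return tuple(sum(mu[w] * X[w] for w in mu) for mu in lps)

def nps_expectation(lps, X):
    """Coefficients (in increasing powers of eps) of E_nu(X), where
    nu = (1 - eps - ... - eps^k) mu_0 + eps mu_1 + ... + eps^k mu_k."""
    e = level_expectations(lps, X)
    return (e[0],) + tuple(ej - e[0] for ej in e[1:])

def nps_leq(p, q):
    """p <= q for polynomials in a positive infinitesimal eps:
    the lowest power at which they differ decides the comparison."""
    for a, b in zip(p, q):
        if a != b:
            return a < b
    return True

# Hypothetical two-level LPS on worlds 0, 1, 2.
lps = [{0: Fraction(1, 2), 1: Fraction(1, 2), 2: Fraction(0)},
       {0: Fraction(0), 1: Fraction(1, 4), 2: Fraction(3, 4)}]

# All random variables with values in {0, 1, 2}: 27 of them.
variables = [dict(zip(range(3), vals)) for vals in product(range(3), repeat=3)]
agree = all(
    (level_expectations(lps, X) <= level_expectations(lps, Y))
    == nps_leq(nps_expectation(lps, X), nps_expectation(lps, Y))
    for X in variables for Y in variables
)
```

The check passes because subtracting the level-0 expectation from the higher coefficients does not change which coefficient is the first to differ between two variables, nor the sign of that difference.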

Theorem 4.5: If $W$ is finite, then $F_{L \to N}$ is a bijection from $\mathit{LPS}(W, \mathcal{F})/\!\approx$ to $\mathit{NPS}(W, \mathcal{F})/\!\approx$
that preserves equivalence (that is, each NPS in $F_{L \to N}([\vec{\mu}])$ is equivalent to $\vec{\mu}$).

Proof: It is easy to check that if $\vec{\mu} = (\mu_0, \ldots, \mu_k)$, then $\vec{\mu} \approx (1 - \epsilon - \cdots - \epsilon^k)\mu_0 +
\epsilon\mu_1 + \cdots + \epsilon^k\mu_k$ (see Lemma A.7 in the appendix for a formal proof). It follows that
$F_{L \to N}$ is an injection from $\mathit{LPS}(W, \mathcal{F})/\!\approx$ to $\mathit{NPS}(W, \mathcal{F})/\!\approx$. To show that $F_{L \to N}$ is a
surjection, we must essentially construct an inverse map; that is, given an NPS $(W, \mathcal{F}, \nu)$
where $W$ is finite, we must find an LPS $\vec{\mu}$ such that $\vec{\mu} \approx \nu$. The idea is to find a finite
collection $\mu_0, \ldots, \mu_k$ of (standard) probability measures, where $k < |W|$, and nonnegative
nonstandard reals $\epsilon_0, \ldots, \epsilon_k$ such that $\mathrm{st}(\epsilon_{i+1}/\epsilon_i) = 0$ and $\nu = \epsilon_0\mu_0 + \cdots + \epsilon_k\mu_k$. A
straightforward argument then shows that $\nu \approx \vec{\mu}$ and $F_{L \to N}([\vec{\mu}]) = [\nu]$. I leave details to
the appendix. ∎
BBD [1991a] also relate nonstandard probability measures and LPS's under the assumption
that the state space is finite, but there are some significant technical differences
between the way they relate them and the approach taken here. BBD prove representation
theorems essentially showing that a preference order on lotteries can be represented
by a standard utility function on lotteries and an LPS iff it can be represented by a standard
utility function on lotteries and an NPS. Thus, they show that NPS's and LPS's
are equiexpressive in terms of representing preference orders on lotteries. The difference
between BBD's result and Theorem 4.5 is essentially a matter of quantification. BBD's
result can be viewed as showing that, given an LPS, for each utility function on lotteries,
there is an NPS that generates the same preference order on lotteries for that particular
utility function. In principle, the NPS might depend on the utility function. More precisely,
for a fixed LPS $\vec{\mu}$, all that follows from their result is that for each utility function
$u$, there is an NPS $\nu$ such that $(\vec{\mu}, u)$ and $(\nu, u)$ generate the same preference order on
lotteries. Theorem 4.5 says that, given $\vec{\mu}$, there is an NPS $\nu$ such that $(\vec{\mu}, u)$ and $(\nu, u)$
generate the same preference on lotteries for all utility functions $u$.

4.2  The  infinite  case 

An LPS over an infinite state space $W$ may not be equivalent to any finite LPS. However,
ideas analogous to those used to prove Proposition 4.4 can be used to provide a bound
on the length of the minimal-length LPS's in an equivalence class.

Proposition 4.6: Every LPS over $(W, \mathcal{F})$ is equivalent to an LPS over $(W, \mathcal{F})$ of length
at most $|\mathcal{F}|$.

The first step in relating LPS's to NPS's is to show that, just as in the finite case,
for every LPS $(\mu_\beta : \beta < \alpha)$ of length $\alpha$, there is an equivalent NPS $\nu$. The idea will be
to set $\nu = (1 - \sum_{0 < \beta < \alpha} \epsilon^{n_\beta})\mu_0 + \sum_{0 < \beta < \alpha} \epsilon^{n_\beta}\mu_\beta$. In the finite case, we could take $n_\beta = \beta$.
This worked because each $\beta$ was finite, and the field $\mathbb{R}(\epsilon)$ includes $\epsilon^j$ for each integer
$j$. But now, since $\alpha$ may be greater than $\omega$, we cannot just take $n_\beta = \beta$. To get this
idea to work in the infinite setting, I consider a nonstandard model of the integers, which
includes an "integer" corresponding to all the ordinals less than $\alpha$. I then construct a
field that includes $\epsilon^{n_\beta}$ even for these nonstandard integers $n_\beta$.

A nonstandard model of the integers is a model that contains the integers and satisfies
every property of the integers expressible in first-order logic. It follows easily from the
compactness theorem of first-order logic [Enderton 1972] that, given an ordinal $\alpha$, there
exists a nonstandard model $I^\alpha$ of the integers that includes elements $n_\beta$, $\beta < \alpha$,
such that $n_j = j$ for $j < \omega$ and $n_\beta < n_{\beta'}$ if $\beta < \beta'$. (Note that since $I^\alpha$ satisfies all the
properties of the integers, it follows that if $n_\beta < n_{\beta'}$, then $n_{\beta'} - n_\beta \ge 1$, a fact that will be
useful later.) The compactness theorem says that, given a collection of formulas, if each
finite subset has a model, then so does the whole set. Consider a language with a function
$+$ and constant symbols for each integer, together with constants $n_\beta$, $\beta < \alpha$. Consider
the collection of first-order formulas in this language consisting of all the formulas true
of the integers, together with the formulas $n_i = i$ for $i < \omega$ and $n_\beta < n_{\beta'}$, for all
$\beta < \beta' < \alpha$. Clearly any finite subset of this set has a model, namely the integers.
Thus, by compactness, so does the full set. Thus, for each ordinal $\alpha$, there is a model $I^\alpha$
with the required properties.

Given $\alpha$, I now construct a field $\mathbb{R}(I^\alpha)$ that includes $\epsilon^n$ for each "integer" $n \in I^\alpha$.
To explain the construction, it is best to first consider $\mathbb{R}(\epsilon)$ in a little more detail. Since
$\mathbb{R}(\epsilon)$ is a field, once it includes $\epsilon$, it must include $p(\epsilon)$, where $p$ is a polynomial with real
coefficients. To ensure that every nonzero element of $\mathbb{R}(\epsilon)$ has an inverse, we need not
just finite polynomials in $\epsilon$, but infinite polynomials in $\epsilon$. The inverse of a polynomial
in $\epsilon$ can then be computed using standard "formal" division of polynomials. Moreover,
the leading exponent of the polynomial can be negative. Thus, the inverse of $\epsilon^3$ is, not
surprisingly, $\epsilon^{-3}$; the inverse of $1 - \epsilon$ is $1 + \epsilon + \epsilon^2 + \cdots$.
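
The "formal" division mentioned above can be sketched for ordinary integer exponents. The code below computes the first few terms of the inverse of a nonzero polynomial in $\epsilon$ whose exponents may be negative. It is only an illustration: the dictionary representation and the function name are assumptions, the input must have finitely many terms, and the output is a truncation of what is in general an infinite series.

```python
from fractions import Fraction

def series_inverse(p, terms):
    """First `terms` terms of the inverse of a nonzero series in eps.

    p maps integer exponents to coefficients; the result has the same form.
    """
    m = min(e for e, c in p.items() if c != 0)   # leading (lowest) exponent
    a = Fraction(p[m])                           # leading coefficient
    # Normalize to 1 + q[1] eps + q[2] eps^2 + ...
    q = [Fraction(p.get(m + k, 0)) / a for k in range(terms)]
    r = [Fraction(1)]
    for k in range(1, terms):
        # The coefficient of eps^k in the product q * r must vanish.
        r.append(-sum(q[j] * r[k - j] for j in range(1, k + 1)))
    # Undo the normalization: divide by a and shift exponents by -m.
    return {k - m: c / a for k, c in enumerate(r) if c != 0}

inv_cube = series_inverse({3: 1}, 4)            # inverse of eps^3
inv_geom = series_inverse({0: 1, 1: -1}, 5)     # inverse of 1 - eps, first 5 terms
```

The two calls reproduce the examples in the text: the inverse of $\epsilon^3$ has the single term $\epsilon^{-3}$, and the inverse of $1 - \epsilon$ begins $1 + \epsilon + \epsilon^2 + \epsilon^3 + \epsilon^4$.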

The field $\mathbb{R}(I^\alpha)$ also includes polynomials in $\epsilon$, but now the exponents are not just
integers, but elements of $I^\alpha$. Since a field is closed under multiplication, if it contains
$\epsilon^{n_1}$ and $\epsilon^{n_2}$, it must also include their product. Since $I^\alpha$ satisfies all the properties of
the integers, if it includes $n_1$ and $n_2$, it also includes an element $n_1 + n_2$, and we can
take $\epsilon^{n_1} \times \epsilon^{n_2} = \epsilon^{n_1 + n_2}$. Formally, let $\mathbb{R}(I^\alpha)$ be the non-Archimedean model defined as
follows: $\mathbb{R}(I^\alpha)$ consists of all polynomials of the form $\sum_{n \in J} r_n \epsilon^n$, where $r_n$ is a standard
real, $\epsilon$ is an infinitesimal, and $J$ is a well-founded subset of $I^\alpha$. (Recall that a set is
well founded if it has no infinite descending sequence; thus, the set of integers is not
well founded, since $\ldots -3 < -2 < -1$ is an infinite descending sequence. The reason I
require well foundedness will be clear shortly.) We can identify the standard real $r$ with
the polynomial $r\epsilon^0$.

The polynomials in $\mathbb{R}(I^\alpha)$ can be added and multiplied using the standard
rules for addition and multiplication of polynomials. It is easy to check that the result of
adding or multiplying two polynomials is another polynomial in $\mathbb{R}(I^\alpha)$. In
particular, if $p_1$ and $p_2$ are two polynomials, $N_1$ is the set of exponents of $p_1$,
and $N_2$ is the set of exponents of $p_2$, then the exponents of $p_1 + p_2$ lie in
$N_1 \cup N_2$, while the exponents of $p_1 p_2$ lie in the set
$N_3 = \{n_1 + n_2 : n_1 \in N_1, n_2 \in N_2\}$. Both $N_1 \cup N_2$ and $N_3$ are easily
seen to be well founded if $N_1$ and $N_2$ are. Moreover, for each expression
$n_1 + n_2 \in N_3$, it follows from the well-foundedness of $N_1$ and $N_2$ that there are
only finitely many pairs $(n, n') \in N_1 \times N_2$ such that $n + n' = n_1 + n_2$, so the
coefficient of $\epsilon^{n_1 + n_2}$ in $p_1 p_2$ is well defined. Finally, each
polynomial (other than 0) has an inverse that can be computed using standard “formal”
division of polynomials; I leave the details to the reader. This step is where the well
foundedness comes in. The formal division process cannot be applied to a polynomial whose
set of exponents is not well founded, such as
$\cdots + \epsilon^{-3} + \epsilon^{-2} + \epsilon^{-1}$. An element of
$\mathbb{R}(I^\alpha)$ is positive if its leading coefficient is positive. Define an order
$<$ on $\mathbb{R}(I^\alpha)$ by taking $a < b$ if $b - a$ is positive. With these
definitions, $\mathbb{R}(I^\alpha)$ is a non-Archimedean
field.

Given $(W, \mathcal{F})$, let $\alpha$ be the minimal ordinal whose cardinality is greater
than or equal to $|\mathcal{F}|$. By construction, $I^\alpha$ has elements $n_\beta$ for all
$\beta < \alpha$ such that $n_i = i$ for $i < \omega$ and $n_\beta < n_{\beta'}$ if
$\beta < \beta' < \alpha$. I now define a map $F_{L \to N}$ from
$LPS(W, \mathcal{F})/{\approx}$ to $NPS(W, \mathcal{F})/{\approx}$ just as suggested earlier.
In more detail, given an equivalence class $[\vec{\mu}] \in LPS(W, \mathcal{F})/{\approx}$,
by Proposition 4.6, there exists $\vec{\mu}' \in [\vec{\mu}]$ such that $\vec{\mu}'$ has
length $\alpha' \le \alpha$. Let
$\nu = (1 - \sum_{0 < \beta < \alpha'} \epsilon^{n_\beta})\mu'_0 + \sum_{0 < \beta < \alpha'} \epsilon^{n_\beta}\mu'_\beta$.
By definition, $\sum_{0 < \beta < \alpha'} \epsilon^{n_\beta} \in \mathbb{R}(I^\alpha)$ (the
set of exponents is well ordered since the ordinals are well ordered), hence so is
$1 - \sum_{0 < \beta < \alpha'} \epsilon^{n_\beta}$. The elements $\epsilon^{n_\beta}$ for
$\beta < \alpha$ are also all in $\mathbb{R}(I^\alpha)$. It easily follows that $\nu$ is a
nonstandard probability measure over the field $\mathbb{R}(I^\alpha)$. As observed earlier,
if $\beta' < \beta$, then $n_\beta - n_{\beta'} \ge 1$, so $\epsilon^{n_\beta}$ is
infinitesimally smaller than $\epsilon^{n_{\beta'}}$. Arguments essentially identical to
those of Lemma A.7 in the appendix can be used to show that $\nu \approx \vec{\mu}'$. Define
$F_{L \to N}[\vec{\mu}] = [\nu]$. The following result is immediate.

Theorem 4.7: $F_{L \to N}$ is an injection from $LPS(W, \mathcal{F})/{\approx}$ to
$NPS(W, \mathcal{F})/{\approx}$ that preserves equivalence.

What about the converse? Is it the case that for every NPS there is an equivalent LPS? The
technique for finding an equivalent LPS used in the finite case fails. There is no obvious
way to find a well-ordered sequence of standard probability measures
$\mu_0, \mu_1, \ldots$ and a sequence of nonnegative nonstandard reals
$\epsilon_0, \epsilon_1, \ldots$ such that $\mathrm{st}(\epsilon_{\beta+1}/\epsilon_\beta) = 0$
and $\nu = \epsilon_0\mu_0 + \epsilon_1\mu_1 + \cdots$. As the following example shows, this
is not an accident. There exist NPS’s that are not equivalent to any LPS.

Example 4.8: As in Example 3.3, let $W = \mathbb{N}$, the natural numbers, let $\mathcal{F}$
consist of the finite and cofinite subsets of $\mathbb{N}$, and let
$\mathcal{F}' = \mathcal{F} - \{\emptyset\}$. Let $\nu^1$ be an NPS with range
$\mathbb{R}(\epsilon)$, where $\nu^1(U) = |U|\epsilon$ if $U$ is finite and
$\nu^1(U) = 1 - |\overline{U}|\epsilon$ if $U$ is cofinite (as usual, $\overline{U}$ denotes
the complement of $U$, which in this case is finite). This is clearly an NPS, and it
corresponds to the cps $\mu^1$ of Example 3.3, in the sense that
$\mathrm{st}(\nu^1(V \mid U)) = \mu^1(V \mid U)$ for all $V \in \mathcal{F}$,
$U \in \mathcal{F}'$. Just as in Example 3.3, it can be shown that there is no LPS
$\vec{\mu}$ such that $\nu^1 \approx \vec{\mu}$.
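The measure $\nu^1$ of this example can be sketched in Python. The sketch assumes a hypothetical encoding that is not part of the paper: a set is described only by whether it is finite or cofinite and by the relevant cardinality, and the value $a + b\epsilon$ is stored as the pair $(a, b)$, so that Python's tuple comparison coincides with the order on $\mathbb{R}(\epsilon)$ for such first-order elements.

```python
from fractions import Fraction

def nu1(kind, size):
    """Value of nu^1 as a pair (a, b) standing for a + b*eps.

    kind is "finite" (size = |U|) or "cofinite" (size = |complement of U|).
    """
    if kind == "finite":
        return (Fraction(0), Fraction(size))
    if kind == "cofinite":
        return (Fraction(1), Fraction(-size))
    raise ValueError(kind)

def more_likely(u1, u2):
    """True if u1 is strictly more likely than u2 according to nu^1."""
    return nu1(*u1) > nu1(*u2)
```

Under this encoding, a cofinite set is more likely than any finite set, and within each class the sets are ordered by cardinality.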

To see the potential relevance of this setup, suppose that a natural number is chosen at
random and, intuitively, all numbers are equally likely to be chosen. An agent may place a
bet on the number being in a finite or cofinite set. Intuitively, the agent should prefer a
bet on a set with larger cardinality. More precisely, if $U_1$ and $U_2$ are two sets in the
algebra, the agent should prefer a bet on $U_1$ over a bet on $U_2$ iff (a) $U_1$ and $U_2$
are both cofinite and the complement of $U_1$ has smaller cardinality than that of $U_2$,
(b) $U_1$ is cofinite and $U_2$ is finite, or (c) $U_1$ and $U_2$ are both finite, and $U_1$
has larger cardinality than $U_2$. These preferences on acts or bets should translate to
statements of likelihood. The NPS captures these preferences directly; they cannot be
captured in an LPS. The cps of Example 3.3 captures (b) directly, and (c) indirectly: when
conditioning on any finite set that contains $U_1 \cup U_2$, the probability of $U_1$ will be
higher than that of $U_2$. ∎


4.3  Countably additive nonstandard probability measures

Do things get any better if countable additivity is required? To answer this question, I
must first make precise what countable additivity means in the context of non-Archimedean
fields. To understand the issue here, recall that for the standard real numbers, every
bounded nondecreasing sequence has a unique least upper bound, which can be taken to be its
limit. Given a countable sum each of whose terms is nonnegative, the partial sums form a
nondecreasing sequence. If the partial sums are bounded (which they are if the terms in the
sums represent the probabilities of a pairwise disjoint collection of sets), then the limit
is well defined.

None of the above is true in the case of non-Archimedean fields. For a trivial
counterexample, consider the sequence $\epsilon, 2\epsilon, 3\epsilon, \ldots$. Clearly this
sequence is bounded (by any positive real number), but it does not have a least upper bound.
For a more subtle example, consider the sequence $1/2, 3/4, 7/8, \ldots$ in the field
$\mathbb{R}(\epsilon)$. Should its limit be 1? While this does not seem to be an
unreasonable choice, note that 1 is not the least upper bound of the sequence. For example,
$1 - \epsilon$ is greater than every term in the sequence, and is less than 1. So are
$1 - 3\epsilon$ and $1 - \epsilon^2$. Indeed, this sequence has no least upper bound in
$\mathbb{R}(\epsilon)$.
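A small Python sketch can make these comparisons concrete. It assumes a hypothetical encoding of $a + b\epsilon + c\epsilon^2$ as the triple $(a, b, c)$, for which tuple comparison agrees with the order on $\mathbb{R}(\epsilon)$, and it only inspects a finite prefix of each sequence, so it illustrates the claims rather than proving them.

```python
from fractions import Fraction

def elem(a, b=0, c=0):
    """a + b*eps + c*eps**2 as a tuple; tuple order matches the field order."""
    return (Fraction(a), Fraction(b), Fraction(c))

# Prefix of the sequence 1/2, 3/4, 7/8, ...
terms = [elem(1 - Fraction(1, 2**k)) for k in range(1, 40)]

# Candidate upper bounds: 1 - 3*eps, 1 - eps, 1 - eps**2, and 1.
bounds = [elem(1, -3), elem(1, -1), elem(1, 0, -1), elem(1)]

all_bounded = all(t < u for t in terms for u in bounds)
bounds_increase = all(x < y for x, y in zip(bounds, bounds[1:]))

# Prefix of eps, 2*eps, 3*eps, ...: every term lies below a tiny positive real.
multiples_bounded = all(elem(0, n) < elem(Fraction(1, 10**9)) for n in range(1, 1000))
```

Each of the four candidates bounds the sampled terms, and the candidates are themselves strictly ordered, with 1 the largest of them.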

Despite these concerns, I define limits in $\mathbb{R}(I^*)$ pointwise. That is, a sequence
$a_1, a_2, a_3, \ldots$ in $\mathbb{R}(I^*)$ converges to $b \in \mathbb{R}(I^*)$ if, for
every $n \in I^*$, the coefficients of $\epsilon^n$ in $a_1, a_2, a_3, \ldots$ converge to
the coefficient of $\epsilon^n$ in $b$. (Since the coefficients are standard reals, the
notion of convergence for the coefficients is just the standard definition of convergence in
the reals. Of course, if $\epsilon^n$ does not appear explicitly, its coefficient is taken to
be 0.) Note that here and elsewhere I use the letters $a$ and $b$ (possibly with subscripts)
to denote (standard) reals, and $\epsilon$ to denote an infinitesimal. As usual,
$\sum_{i=1}^{\infty} a_i$ is taken to be $b$ if the sequence of partial sums
$\sum_{i=1}^{n} a_i$ converges to $b$. Note that, with this notion of convergence,
$1/2, 3/4, 7/8, \ldots$ converges to 1 even though 1 is not the least upper bound of the
sequence.$^6$ I discuss the consequences of this choice further in Section 7.

With this notion of countable sum, it makes perfect sense to consider countably additive
nonstandard probability measures. If $\mathcal{F}$ is a $\sigma$-algebra and
$LPS^c(W, \mathcal{F})$ and $NPS^c(W, \mathcal{F})$ denote the countably additive LPS’s and
NPS’s on $(W, \mathcal{F})$, respectively, then Theorem 4.7 can be applied with no change in
proof to show the following.

Theorem 4.9: $F_{L \to N}$ is an injection from $LPS^c(W, \mathcal{F})/{\approx}$ to
$NPS^c(W, \mathcal{F})/{\approx}$.

However,  as  the  following  example  shows,  even  with  the  requirement  of  countable 
additivity,  there  are  nonstandard  probability  measures  that  are  not  equivalent  to  any 
LPS. 

Example 4.10: Let $W = \{w_1, w_2, w_3, \ldots\}$, and let $\mathcal{F} = 2^W$. Choose any
nonstandard $I^*$ and fix an infinitesimal $\epsilon$ in $\mathbb{R}(I^*)$. Define an NPS
$(W, \mathcal{F}, \nu)$ with range $\mathbb{R}(I^*)$ by taking
$\nu(w_j) = a_j + b_j\epsilon$, where $a_j = 1/2^j$, $b_{2j-1} = 1/2^{j-1}$, and
$b_{2j} = -1/2^{j-1}$, for $j = 1, 2, 3, \ldots$. Thus, the probabilities of
$w_1, w_2, \ldots$ are characterized by the sequence
$1/2 + \epsilon,\ 1/4 - \epsilon,\ 1/8 + \epsilon/2,\ 1/16 - \epsilon/2,\ 1/32 + \epsilon/4, \ldots$.
For $U \subseteq W$, define
$\nu(U) = \sum_{\{j : w_j \in U\}} a_j + \epsilon \sum_{\{j : w_j \in U\}} b_j$. It is easy
to see that these sums are well defined. These likelihoods correspond to preferences. For
example, an agent should prefer a bet that gives a payoff of 1 if $w_2$ occurs and 0
otherwise to a bet that gives a payoff of 4 if $w_4$ occurs and 0 otherwise. As I show in the
appendix (see Proposition A.9), there is no LPS $\vec{\mu}$ over $(W, \mathcal{F})$ such that
$\nu \approx \vec{\mu}$. ∎

$^6$ For those used to thinking of convergence in topological terms, what is going on here is
that the topology corresponding to this notion of convergence is not Hausdorff.
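The measure of Example 4.10 and the bet comparison it mentions can be checked with a short Python sketch. As an assumption of the sketch (not of the paper), $a + b\epsilon$ is stored as the pair $(a, b)$ and compared lexicographically; the helper names `nu` and `scale` are hypothetical.

```python
from fractions import Fraction

def nu(j):
    """(a_j, b_j) for nu(w_j) = a_j + b_j*eps, with j = 1, 2, 3, ..."""
    a = Fraction(1, 2**j)
    m = (j + 1) // 2                  # j = 2m - 1 or j = 2m
    b = Fraction(1, 2**(m - 1))
    return (a, b if j % 2 == 1 else -b)

def scale(c, x):
    """Expected payoff of a bet paying c on a world with probability x."""
    return (c * x[0], c * x[1])

bet_on_w2 = scale(1, nu(2))   # payoff 1 if w2 occurs: 1/4 - eps
bet_on_w4 = scale(4, nu(4))   # payoff 4 if w4 occurs: 1/4 - 2*eps

# Partial sums over the first 20 worlds: standard parts and eps coefficients.
standard_mass = sum(nu(j)[0] for j in range(1, 21))
infinitesimal_mass = sum(nu(j)[1] for j in range(1, 21))
```

The two bets have the same standard part, $1/4$, and differ only in the coefficient of $\epsilon$, which is what makes the first bet strictly better. The $\epsilon$ coefficients cancel in pairs, so the partial sums of the $b_j$ over an even number of worlds are 0.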

Roughly speaking, the reason that $\nu$ is not equivalent to any LPS in Example 4.10 is that
the ratio between $a_j$ and $b_j$ in the definition of $\nu$ (i.e., the ratio between the
“standard part” of $\nu(w_j)$ and the “infinitesimal part” of $\nu(w_j)$) goes to zero. This
can be generalized so as to give a condition on nonstandard probability measures that is
necessary and sufficient to guarantee that they can be represented by an LPS. However, the
condition is rather technical and I have not found an interesting interpretation of it, so I
do not pursue it here.


5  Relating  Popper  Spaces  to  NPS’s 

Consider the map $F_{N \to P}$ from nonstandard probability spaces to Popper spaces such that
$F_{N \to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}', \mu)$, where
$\mathcal{F}' = \{U : \nu(U) \ne 0\}$ and $\mu(V \mid U) = \mathrm{st}(\nu(V \mid U))$ for
$V \in \mathcal{F}$, $U \in \mathcal{F}'$. I leave it to the reader to check that
$(W, \mathcal{F}, \mathcal{F}', \mu)$ is indeed a Popper space. This is arguably the most
natural map; for example, it is easy to check that
$F_{N \to P} \circ F_{S \to N} = F_{S \to P}$, where $F_{S \to N}$ is the restriction of
$F_{L \to N}$ to SLPS’s. (Note that $F_{L \to N}$ is well defined on SLPS’s, since if
$\vec{\mu}$ is an SLPS, by Proposition 4.3, $[\vec{\mu}] = \{\vec{\mu}\}$.)

We might hope that $F_{N \to P}$ is a bijection from $NPS(W, \mathcal{F})/{\approx}$ to
$Pop(W, \mathcal{F})$. As I show shortly, it is not. To understand $F_{N \to P}$ better,
define an equivalence relation $\simeq$ on $NPS(W, \mathcal{F})$ (and
$NPS^c(W, \mathcal{F})$) by taking $\nu_1 \simeq \nu_2$ if
$\{U : \nu_1(U) = 0\} = \{U : \nu_2(U) = 0\}$ and
$\mathrm{st}(\nu_1(V \mid U)) = \mathrm{st}(\nu_2(V \mid U))$ for all $V, U$ such that
$\nu_1(U) \ne 0$. Thus, $\simeq$ essentially says that infinitesimal differences between
conditional probabilities do not count. Let $NPS/{\simeq}$ (resp., $NPS^c/{\simeq}$) consist
of the $\simeq$ equivalence classes in $NPS$ (resp., $NPS^c$). Clearly $F_{N \to P}$ is well
defined as a map from $NPS/{\simeq}$ to $Pop(W, \mathcal{F})$ and from $NPS^c/{\simeq}$ to
$Pop^c(W, \mathcal{F})$. As the following result shows, $F_{N \to P}$ is actually a bijection
from $NPS^c/{\simeq}$ to $Pop^c(W, \mathcal{F})$.

Theorem 5.1: $F_{N \to P}$ is a bijection from $NPS(W, \mathcal{F})/{\simeq}$ to
$Pop(W, \mathcal{F})$ and from $NPS^c(W, \mathcal{F})/{\simeq}$ to $Pop^c(W, \mathcal{F})$.

Proof: It is easy to see that $F_{N \to P}$ is an injection. In the countably additive case,
the inverse map can be defined using earlier results. If
$(W, \mathcal{F}, \mathcal{F}', \mu) \in Pop^c(W, \mathcal{F})$, by Theorem 3.5, there is a
countably additive SLPS $\vec{\mu}'$ such that
$F_{S \to P}((W, \mathcal{F}, \vec{\mu}')) = (W, \mathcal{F}, \mathcal{F}', \mu)$. By
Theorem 4.7, there is some $(W, \mathcal{F}, \nu) \in NPS^c(W, \mathcal{F})$ such that
$\nu \approx \vec{\mu}'$. It is not hard to show that
$F_{N \to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}', \mu)$; see the appendix for
details. Showing that $F_{N \to P}$ is a surjection in the finitely additive case requires
more work; again, see the appendix for details. ∎

McGee [1994] proves essentially the same result as Theorem 5.1 in the case that $\mathcal{F}$
is an algebra (and the measures involved are not necessarily countably additive). McGee
[1994, p. 181] says that his result shows that “these two approaches amount to the same
thing”. However, this is far from clear. The $\simeq$ relation is rather coarse. In
particular, it is coarser than $\approx$.

Proposition 5.2: If $\nu_1 \approx \nu_2$ then $\nu_1 \simeq \nu_2$.

The converse of Proposition 5.2 does not hold in general. As a result, the $\simeq$ relation
identifies nonstandard measures that behave quite differently in decision contexts. This
difference already arises in finite spaces, as the following example shows.

Example 5.3: Suppose $W = \{w_1, w_2\}$. Consider the nonstandard probability measure $\nu_1$
such that $\nu_1(w_1) = 1/2 + \epsilon$ and $\nu_1(w_2) = 1/2 - \epsilon$. (This is
equivalent to the LPS $(\mu_1, \mu_2)$ where $\mu_1(w_1) = \mu_1(w_2) = 1/2$,
$\mu_2(w_1) = 1$, and $\mu_2(w_2) = 0$.) Let $\nu_2$ be the nonstandard probability measure
such that $\nu_2(w_1) = \nu_2(w_2) = 1/2$. Clearly $\nu_1 \simeq \nu_2$. However, it is not
the case that $\nu_1 \approx \nu_2$. Consider the two random variables $\chi_{\{w_1\}}$ and
$\chi_{\{w_2\}}$. (I use the notation $\chi_U$ to denote the indicator function for $U$; that
is, $\chi_U(w) = 1$ if $w \in U$ and $\chi_U(w) = 0$ otherwise.) According to $\nu_1$, the
expected value of $\chi_{\{w_1\}}$ is (very slightly) higher than that of $\chi_{\{w_2\}}$.
According to $\nu_2$, $\chi_{\{w_1\}}$ and $\chi_{\{w_2\}}$ have the same expected value.
Thus, $\nu_1 \not\approx \nu_2$. Moreover, it is easy to see that there is no Popper measure
$\mu$ on $\{w_1, w_2\}$ that can make the same distinctions with respect to
$\chi_{\{w_1\}}$ and $\chi_{\{w_2\}}$ as $\nu_1$, no matter how we define expected value with
respect to a Popper measure. According to $\nu_1$, although the expected value of
$\chi_{\{w_1\}}$ is higher than that of $\chi_{\{w_2\}}$, the expected value of
$\chi_{\{w_1\}}$ is less than that of $a\chi_{\{w_2\}}$ for any (standard) real $a > 1$. There
is no Popper measure with this behavior. ∎
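The comparisons in Example 5.3 can be replayed in Python. The sketch below assumes a hypothetical encoding in which $a + b\epsilon$ is the pair $(a, b)$ ordered lexicographically, and it restricts payoffs to standard reals so that expectations stay first order in $\epsilon$.

```python
from fractions import Fraction

half = Fraction(1, 2)
nu1 = {"w1": (half, Fraction(1)), "w2": (half, Fraction(-1))}   # 1/2 + eps, 1/2 - eps
nu2 = {"w1": (half, Fraction(0)), "w2": (half, Fraction(0))}    # 1/2, 1/2

def expect(nu, payoff):
    """Expected value of a real-valued payoff, as (standard part, eps coefficient)."""
    std = sum(nu[w][0] * payoff[w] for w in nu)
    inf = sum(nu[w][1] * payoff[w] for w in nu)
    return (std, inf)

def scaled(a, payoff):
    """The payoff function a * payoff for a standard real a."""
    return {w: a * v for w, v in payoff.items()}

chi_w1 = {"w1": 1, "w2": 0}
chi_w2 = {"w1": 0, "w2": 1}

same_standard_parts = all(nu1[w][0] == nu2[w][0] for w in nu1)
```

With these definitions, `expect(nu1, chi_w1)` exceeds `expect(nu1, chi_w2)` only in the $\epsilon$ coefficient, so multiplying $\chi_{\{w_2\}}$ by any standard $a > 1$ reverses the comparison, while `nu2` treats the two indicators identically.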

More generally, in finite spaces, Theorem 3.1 shows that Popper spaces are equivalent to
SLPS’s, while Theorem 4.5 shows that $LPS(W, \mathcal{F})/{\approx}$ is equivalent to
$NPS(W, \mathcal{F})/{\approx}$. By Proposition 4.3, $SLPS(W, \mathcal{F})/{\approx}$ is
essentially identical to $SLPS(W, \mathcal{F})$ (all the equivalence classes in
$SLPS(W, \mathcal{F})/{\approx}$ are singletons), so in finite spaces, the gap in expressive
power between Popper spaces and NPS’s essentially amounts to the gap between
$SLPS(W, \mathcal{F})$ and $LPS(W, \mathcal{F})/{\approx}$. This gap is nontrivial. For
example, there is no SLPS equivalent to the LPS $(\mu_1, \mu_2)$ that represents the NPS in
Example 5.3.


6  Independence 

The notion of independence is fundamental. As I show in this section, the results of the
previous sections shed light on various notions of independence considered in the literature
for LPS’s and (variants of) cps’s. I first consider independence for events and then
independence for random variables. I then relate my definitions to those of BBD, Hammond,
and Kohlberg and Reny [1997].

Intuitively, event $U$ is independent of $V$ if learning $U$ gives no information about $V$.
Certainly if learning $U$ gives no information about $V$, then if $\mu$ is an arbitrary
probability measure, we would expect that $\mu(V \mid U) = \mu(V)$. Indeed, this is often
taken as the definition of $V$ being independent of $U$ with respect to $\mu$. If standard
probability measures are used, conditioning is not defined if $\mu(U) = 0$. In this case,
$U$ is still considered independent of $V$. As is well known, if $U$ is independent of $V$,
then $\mu(U \cap V) = \mu(V) \times \mu(U)$ and $V$ is independent of $U$, that is,
$\mu(U \mid V) = \mu(U)$. Thus, independence of events with respect to a probability measure
can be defined in any of three equivalent ways. Unfortunately, these definitions are not
equivalent for other representations of uncertainty (see [Halpern 2003, Chapter 4] for a
general discussion of this issue).

The situation is perhaps simplest for nonstandard probability measures.$^7$ In this case, the
three notions coincide, for exactly the same reasons as they do for standard probability
measures. However, independence is perhaps too strong a notion in some ways. In particular,
nonstandard measures that are equivalent do not in general agree on independence, as the
following example shows.

Example 6.1: Suppose that $W = \{w_1, w_2, w_3, w_4\}$. Let
$\nu_i(w_1) = 1 - 2\epsilon + \epsilon_i$, $\nu_i(w_2) = \nu_i(w_3) = \epsilon - \epsilon_i$,
and $\nu_i(w_4) = \epsilon_i$, for $i = 1, 2$, where $\epsilon_1 = \epsilon^2$ and
$\epsilon_2 = \epsilon^3$. If $U = \{w_2, w_4\}$ and $V = \{w_3, w_4\}$, then
$\nu_i(U) = \nu_i(V) = \epsilon$ and $\nu_i(U \cap V) = \epsilon_i$. It follows that $U$ and
$V$ are independent with respect to $\nu_1$, but not with respect to $\nu_2$. However, it is
easy to check that $\nu_1 \approx \nu_2$. ∎
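The independence claims of Example 6.1 amount to polynomial arithmetic in $\epsilon$, which can be sketched in Python. The representation is an assumption of the sketch: a polynomial is a dictionary from exponents to nonzero coefficients, and the helper names (`poly`, `add`, `mul`, `measure`, `prob`) are hypothetical.

```python
from fractions import Fraction

def poly(*terms):
    """Build a polynomial in eps from (coefficient, exponent) pairs."""
    p = {}
    for coeff, exp in terms:
        p[exp] = p.get(exp, Fraction(0)) + Fraction(coeff)
    return {e: c for e, c in p.items() if c != 0}

def add(p, q):
    return poly(*[(c, e) for e, c in p.items()], *[(c, e) for e, c in q.items()])

def mul(p, q):
    return poly(*[(a * b, i + j) for i, a in p.items() for j, b in q.items()])

def measure(k):
    """nu_i of Example 6.1 with eps_i = eps**k (k = 2 for nu_1, k = 3 for nu_2)."""
    return {
        "w1": poly((1, 0), (-2, 1), (1, k)),
        "w2": poly((1, 1), (-1, k)),
        "w3": poly((1, 1), (-1, k)),
        "w4": poly((1, k)),
    }

def prob(nu, event):
    total = {}
    for w in event:
        total = add(total, nu[w])
    return total

U, V = {"w2", "w4"}, {"w3", "w4"}

def independent(nu):
    """Exact independence: nu(U & V) == nu(U) * nu(V)."""
    return prob(nu, U & V) == mul(prob(nu, U), prob(nu, V))
```

Calling `independent(measure(2))` checks the product rule for $\nu_1$ and `independent(measure(3))` does so for $\nu_2$; the only difference between the two is the power of $\epsilon$ assigned to $w_4$.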

Example 6.1 shows that independence of events in the context of nonstandard measures is very
sensitive to the choice of $\epsilon$, even if this choice does not affect decision making at
all. This suggests the following definition: $U$ is approximately independent of $V$ with
respect to $\nu$ if $\nu(U) \ne 0$ implies that $\nu(V \mid U) - \nu(V)$ is infinitesimal,
that is, if $\mathrm{st}(\nu(V \mid U)) = \mathrm{st}(\nu(V))$. Note that $U$ can be
approximately independent of $V$ without $V$ being approximately independent of $U$. For
example, consider the nonstandard probability measure $\nu_1$ from Example 6.1. Let
$V' = \{w_4\}$; as before, let $U = \{w_2, w_4\}$. It is easy to check that
$\mathrm{st}(\nu_1(V' \mid U)) = \mathrm{st}(\nu_1(V')) = 0$, but
$\mathrm{st}(\nu_1(U \mid V')) = 1$, while $\mathrm{st}(\nu_1(U)) = 0$. Thus, $U$ is
approximately independent of $V'$ with respect to $\nu_1$, but $V'$ is not approximately
independent of $U$. Similarly, $U$ can be approximately independent of $V$ without
$\overline{U}$ being approximately independent of $V$. For example, it is easy to check that
$\overline{V'}$ is approximately independent of $U$ with respect to $\nu_1$, although $V'$ is
not.

A straightforward argument shows that $U$ is approximately independent of $V$ with respect
to $\nu$ iff $\nu(U) \ne 0$ implies
$\mathrm{st}((\nu(V \cap U) - \nu(V) \times \nu(U))/\nu(U)) = 0$, while $V$ is approximately
independent of $U$ with respect to $\nu$ iff the same statement holds with the roles of $V$
and $U$ reversed. Note for future reference that each of these requirements is stronger than
just requiring that $\mathrm{st}(\nu(V \cap U) - \nu(V) \times \nu(U)) = 0$. The latter
requirement is automatically met, for example, if the probability of either $U$ or $V$ is
infinitesimal.

$^7$ Although I talk about $U$ being independent of $V$ with respect to a nonstandard measure
$\nu$, technically I should talk about $U$ being independent of $V$ with respect to an NPS
$(W, \mathcal{F}, \nu)$, for $U, V \in \mathcal{F}$. I continue to be sloppy at times,
reverting to more careful notation when necessary.

The definition of (approximate) independence extends in a straightforward way to
(approximate) conditional independence. $U$ is conditionally independent of $V$ given $V'$
with respect to a (standard or nonstandard) probability measure $\nu$ if
$\nu(U \cap V') \ne 0$ implies $\nu(V \mid U \cap V') = \nu(V \mid V')$. Again, for
probability, $U$ is conditionally independent of $V$ given $V'$ iff $V$ is conditionally
independent of $U$ given $V'$ iff
$\nu(V \cap U \mid V') = \nu(V \mid V') \times \nu(U \mid V')$. $U$ is approximately
conditionally independent of $V$ given $V'$ with respect to $\nu$ if
$\mathrm{st}(\nu(V \mid U \cap V')) = \mathrm{st}(\nu(V \mid V'))$. If $V'$ is taken to be
$W$, the whole space, then (approximate) conditional independence reduces to (approximate)
independence.

The following proposition shows that, although independence is not preserved by equivalence,
approximate independence is.

Proposition 6.2: If $U$ is approximately conditionally independent of $V$ given $V'$ with
respect to $\nu$, and $\nu \approx \nu'$, then $U$ is approximately conditionally independent
of $V$ given $V'$ with respect to $\nu'$.

Proof: Suppose that $\nu \approx \nu'$. I claim that for all events $U_1$ and $U_2$ such that
$\nu(U_2) \ne 0$, $\mathrm{st}(\nu(U_1)/\nu(U_2)) = \mathrm{st}(\nu'(U_1)/\nu'(U_2))$. For
suppose that $\mathrm{st}(\nu(U_1)/\nu(U_2)) = \alpha$. Then it easily follows that
$E_\nu(\chi_{U_1}) < E_\nu(\alpha'\chi_{U_2})$ for all $\alpha' > \alpha$, and
$E_\nu(\chi_{U_1}) > E_\nu(\alpha''\chi_{U_2})$ for all $\alpha'' < \alpha$. Thus, the same
must be true for $E_{\nu'}$, and hence $\mathrm{st}(\nu'(U_1)/\nu'(U_2)) = \alpha$. It thus
follows that $\mathrm{st}(\nu(V \mid U \cap V')) = \mathrm{st}(\nu'(V \mid U \cap V'))$ and
$\mathrm{st}(\nu(V \mid V')) = \mathrm{st}(\nu'(V \mid V'))$, from which the result is
immediate. ∎

There is an obvious definition of independence for events for Popper spaces: $U$ is
independent of $V$ given $V'$ with respect to the Popper space
$(W, \mathcal{F}, \mathcal{F}', \mu)$ if $U \cap V' \in \mathcal{F}'$ implies that
$\mu(V \mid U \cap V') = \mu(V \mid V')$; if $U \cap V' \notin \mathcal{F}'$, then $U$ is also
taken to be independent of $V$ given $V'$. If $U$ is independent of $V$ given $V'$ and
$V' \in \mathcal{F}'$, then $\mu(U \cap V \mid V') = \mu(U \mid V') \times \mu(V \mid V')$.
However, the converse does not necessarily hold. Nor is it the case that if $U$ is
independent of $V$ given $V'$ then $V$ is independent of $U$ given $V'$. A counterexample can
be obtained by taking the Popper space arising from the NPS in Example 6.1. Consider the
Popper space $(W, 2^W, \mathcal{F}', \mu)$ corresponding to the NPS $(W, 2^W, \nu_1)$ via the
bijection $F_{N \to P}$. It is easy to check that $U$ is independent of $V'$ but $V'$ is not
independent of $U$ with respect to this Popper space, although
$\mu(V' \cap U) = \mu(U \mid V') \times \mu(V')$ ($= 0$). This observation is an instance of
the following more general result, which is almost immediate from the definitions:

Proposition 6.3: $U$ is approximately independent of $V$ given $V'$ with respect to the NPS
$(W, \mathcal{F}, \nu)$ iff $U$ is independent of $V$ given $V'$ with respect to the Popper
space $F_{N \to P}(W, \mathcal{F}, \nu)$.

How should independence be defined in LPS’s? Interestingly, neither BBD nor Hammond define
independence directly for LPS’s. However, they do give definitions in terms of NPS’s that
can be applied to equivalent LPS’s; indeed, BBD [1991b] do just this (see the discussion of
BBD strong independence below).

I now consider independence for random variables. If $X$ is a random variable on $W$, let
$\mathcal{V}(X)$ denote the range (set of possible values) of random variable $X$; that is,
$\mathcal{V}(X) = \{X(w) : w \in W\}$. Recall that I am assuming that all random variables
have countable range. Random variable $X$ is independent of $Y$ with respect to a standard
probability measure $\mu$ if the event $X = x$ is independent of the event $Y = y$ with
respect to $\mu$, for all $x \in \mathcal{V}(X)$ and $y \in \mathcal{V}(Y)$. By analogy, for
nonstandard probability measures, following Kohlberg and Reny [1997], define $X$ and $Y$ to
be weakly independent with respect to $\nu$ if $X = x$ is approximately independent of
$Y = y$ and $Y = y$ is approximately independent of $X = x$ with respect to $\nu$ for all
$x \in \mathcal{V}(X)$ and $y \in \mathcal{V}(Y)$.$^8$

For standard probability measures, it easily follows that if $X$ is independent of $Y$, then
$X \in U_1$ is independent of $Y \in V_1$ conditional on $Y \in V_2$ and $Y \in V_1$ is
independent of $X \in U_1$ conditional on $X \in U_2$, for all
$U_1, U_2 \subseteq \mathcal{V}(X)$ and $V_1, V_2 \subseteq \mathcal{V}(Y)$. The same
arguments show that this is also true for nonstandard probability measures. However, the
argument breaks down for approximate independence.

Example 6.4: Suppose that $W = \{1, 2, 3\} \times \{1, 2\}$. Let $X$ and $Y$ be the random
variables that project onto the first and second components of a world, respectively, so
that $X(i, j) = i$ and $Y(i, j) = j$. Let $\nu$ be the nonstandard probability measure on $W$
given by the following table:

           Y = 1                              Y = 2
  X = 1    $1 - 3\epsilon - 3\epsilon^2$      $\epsilon$
  X = 2    $\epsilon$                         $\epsilon^2$
  X = 3    $\epsilon$                         $2\epsilon^2$

It is easy to check that $X$ and $Y$ are weakly independent with respect to $\nu$; that is,
$X = i$ and $Y = j$ are approximately independent of each other for all $i \in \{1, 2, 3\}$,
$j \in \{1, 2\}$. However,
$\mathrm{st}(\nu(X = 2 \mid X \in \{2, 3\} \cap Y = 2)) = 1/3$, while
$\mathrm{st}(\nu(X = 2 \mid X \in \{2, 3\})) = 1/2$. ∎
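The claims of Example 6.4 can be verified mechanically. The Python sketch below assumes a hypothetical representation: each probability is a dictionary from exponents of $\epsilon$ to coefficients, and the standard part of a conditional probability is read off from the lowest-order terms. The helper names `prob` and `st_cond` are not from the paper.

```python
from fractions import Fraction

# nu[(i, j)] maps exponent of eps -> coefficient, for the world (X = i, Y = j).
nu = {
    (1, 1): {0: Fraction(1), 1: Fraction(-3), 2: Fraction(-3)},
    (1, 2): {1: Fraction(1)},
    (2, 1): {1: Fraction(1)},
    (2, 2): {2: Fraction(1)},
    (3, 1): {1: Fraction(1)},
    (3, 2): {2: Fraction(2)},
}

def prob(event):
    """Sum nu over the worlds (i, j) satisfying the predicate `event`."""
    total = {}
    for world, p in nu.items():
        if event(*world):
            for e, c in p.items():
                total[e] = total.get(e, Fraction(0)) + c
    return {e: c for e, c in total.items() if c != 0}

def st_cond(target, given):
    """Standard part of nu(target | given), assuming nu(given) != 0."""
    num = prob(lambda i, j: target(i, j) and given(i, j))
    den = prob(given)
    lead = min(den)                  # the lowest power of eps dominates
    if not num or min(num) > lead:
        return Fraction(0)
    return num[lead] / den[lead]

always = lambda i, j: True

weakly_independent = all(
    st_cond(lambda i, j, y=y: j == y, lambda i, j, x=x: i == x)
    == st_cond(lambda i, j, y=y: j == y, always)
    and st_cond(lambda i, j, x=x: i == x, lambda i, j, y=y: j == y)
    == st_cond(lambda i, j, x=x: i == x, always)
    for x in (1, 2, 3) for y in (1, 2)
)

cond_on_both = st_cond(lambda i, j: i == 2, lambda i, j: i in (2, 3) and j == 2)
cond_on_x = st_cond(lambda i, j: i == 2, lambda i, j: i in (2, 3))
```

The two final values differ, which is exactly the failure of conditional approximate independence that the example describes, even though every pairwise check in `weakly_independent` succeeds.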

In light of this example, I define $X$ to be approximately independent of
$\{Y_1, \ldots, Y_n\}$ with respect to $\nu$ if $X \in U_1$ is approximately independent of
$(Y_1 \in V_1) \cap \ldots \cap (Y_n \in V_n)$ conditional on
$(Y_1 \in V_1') \cap \ldots \cap (Y_n \in V_n')$ with respect to $\nu$ for all
$U_1 \subseteq \mathcal{V}(X)$, $V_i, V_i' \subseteq \mathcal{V}(Y_i)$, and
$i = 1, \ldots, n$. $X_1, \ldots, X_n$ are approximately independent with respect to $\nu$ if
$X_i$ is approximately independent of $\{X_1, \ldots, X_n\} - \{X_i\}$ with respect to $\nu$
for $i = 1, \ldots, n$. I leave to the reader the obvious extensions to conditional
independence and the analogues of this definition for Popper spaces and LPS’s.

$^8$ Kohlberg and Reny’s definition of weak independence also requires that the joint range
of $X$ and $Y$ be the product of the individual ranges. That is, for $X$ and $Y$ to be weakly
independent, it must be the case that for all $x \in \mathcal{V}(X)$ and
$y \in \mathcal{V}(Y)$, there exists some $w \in W$ such that $X(w) = x$ and $Y(w) = y$. Of
course, this requirement could also be added to the definition I am proposing here; adding it
would not affect any of the results of this paper.

As I said, BBD consider three notions of independence for random variables. One is a
decision-theoretic notion of stochastic independence on preference relations on acts over
$W$. Under appropriate assumptions, it can be shown that a preference relation is
stochastically independent iff it can be represented by some (real-valued) utility function
$u$ and a nonstandard probability measure $\nu$ such that $X_1, \ldots, X_n$ are
approximately independent with respect to $\nu$ [Battigalli and Veronesi 1996]. A second
notion they consider is a weak notion of product measure that requires only that there exist
measures $\nu_1, \ldots, \nu_n$ such that
$\mathrm{st}(\nu(w_1, \ldots, w_n)) = \mathrm{st}(\nu_1(w_1) \times \cdots \times \nu_n(w_n))$.
As we have already observed, this notion of independence is rather weak. Indeed, an example
in BBD shows that it misses out on some interesting decision-theoretic behavior.

The third notion of independence that BBD consider is the strongest. BBD [1991b] define
$X_1, \ldots, X_n$ to be strongly independent with respect to an LPS $\vec{\mu}$ if they are
independent (in the usual sense) with respect to an NPS $\nu$ such that
$\vec{\mu} \approx \nu$.$^9$ Moreover, they give a characterization of this notion of strong
independence, which I henceforth call BBD strong independence, to distinguish it from the KR
notion of strong independence that I discuss shortly. Given a vector
$\vec{r} = (r^0, \ldots, r^{k-1})$ of reals in $(0, 1)^k$ and a finite LPS
$\vec{\mu} = (\mu^0, \ldots, \mu^k)$, let $\vec{\mu} \,\square\, \vec{r}$ be the (standard)
probability measure

$$(1 - r^0)\mu^0 + r^0[(1 - r^1)\mu^1 + r^1[(1 - r^2)\mu^2 + r^2[\cdots + r^{k-2}[(1 - r^{k-1})\mu^{k-1} + r^{k-1}\mu^k]\cdots]]].$$

Note that $\vec{\mu} \,\square\, \vec{r}$ is defined only if $\vec{\mu}$ is finite. Thus, in
discussing BBD strong independence, I restrict to finite LPS’s. In addition, for technical
reasons that will become clear in the proof of Theorem 6.5, I consider only random variables
with finite range, which is what BBD do as well. BBD [1991b, p. 90] claim without proof that
“it is straightforward to show” that $X_1, \ldots, X_n$ are BBD strongly independent with
respect to $\vec{\mu}$ iff there is a sequence $\vec{r}^j$, $j = 1, 2, \ldots$ of vectors in
$(0, 1)^k$ such that $\vec{r}^j \to (0, \ldots, 0)$ as $j \to \infty$, and
$X_1, \ldots, X_n$ are independent with respect to $\vec{\mu} \,\square\, \vec{r}^j$ for
$j = 1, 2, 3, \ldots$. I can prove this result only if the NPS $\nu$ such that
$\vec{\mu} \approx \nu$ and $X_1, \ldots, X_n$ are independent with respect to $\nu$ has a
range that is an elementary extension of the reals (and thus has the same first-order
properties as the reals).
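The nested mixture defining $\vec{\mu} \,\square\, \vec{r}$ can be sketched in Python. The sketch makes two assumptions that are not in the text: each measure is a dictionary listing every world explicitly (with probability 0 where needed), and the innermost term of the nesting is $r^{k-1}\mu^k$. The function name `box` is hypothetical.

```python
from fractions import Fraction

def box(lps, r):
    """Mix the measures of a finite LPS (mu^0, ..., mu^k) with weights from r.

    lps: list of k+1 dicts mapping worlds to probabilities (same keys in each).
    r:   list of k numbers in (0, 1).
    """
    if len(r) != len(lps) - 1:
        raise ValueError("need exactly one r per measure after the first")
    mixed = dict(lps[-1])                        # innermost term: mu^k
    for mu, ri in zip(reversed(lps[:-1]), reversed(r)):
        # One level of nesting: (1 - r^i) mu^i + r^i [ ... ]
        mixed = {w: (1 - ri) * mu[w] + ri * mixed[w] for w in mu}
    return mixed

# A three-level LPS of point masses on worlds a, b, c.
lps = [
    {"a": 1, "b": 0, "c": 0},
    {"a": 0, "b": 1, "c": 0},
    {"a": 0, "b": 0, "c": 1},
]
mixture = box(lps, [Fraction(1, 10), Fraction(1, 10)])
```

In this example the weight on $\mu^0$ is $1 - r^0$, the weight on $\mu^1$ is $r^0(1 - r^1)$, and the weight on $\mu^2$ is $r^0 r^1$, so as the entries of $\vec{r}$ shrink, each later measure in the LPS contributes an order of magnitude less than the one before it.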

Theorem 6.5: There exists an NPS $\nu$ whose range is an elementary extension of the reals
such that $\vec{\mu} \approx \nu$ and $X_1, \ldots, X_n$ are independent with respect to
$\nu$ iff there exists a sequence $\vec{r}^j$, $j = 1, 2, \ldots$ of vectors in $(0, 1)^k$
such that $\vec{r}^j \to (0, \ldots, 0)$ as $j \to \infty$, and $X_1, \ldots, X_n$ are
independent with respect to $\vec{\mu} \,\square\, \vec{r}^j$ for $j = 1, 2, 3, \ldots$.

I do not know if this result holds without requiring that the range of $\nu$ be an elementary
extension of the reals.

$^9$ In [Blume, Brandenburger, and Dekel 1991b], BBD say that this definition of strong
independence is given in [Blume, Brandenburger, and Dekel 1991a]. However, the definition
appears to be given only in terms of NPS’s in [Blume, Brandenburger, and Dekel 1991a].

Kohlberg and Reny [1997] define a notion of strong independence with respect to what they
call relative probability spaces, which are closely related to Popper spaces of the form
$(W, 2^W, 2^W - \{\emptyset\}, \mu)$, where all subsets of $W$ are measurable and it is
possible to condition on all nonempty sets. Their definition is similar in spirit to the
characterization of BBD strong independence given in Theorem 6.5. For ease of exposition, I
recast their definition in terms of Popper spaces. $X_1, \ldots, X_n$ are KR-strongly
independent with respect to the Popper space $(W, \mathcal{F}, \mathcal{F}', \mu)$, where
$\mathcal{F}'$ includes all events of the form $X_i = x$ for $x \in \mathcal{V}(X_i)$, if
there exists a sequence of standard probability measures $\mu_1, \mu_2, \ldots$ such that
$\mu_j \to \mu$, and for all $j = 1, 2, 3, \ldots$, $\mu_j(U) > 0$ for $U \in \mathcal{F}'$
and $X_1, \ldots, X_n$ are independent with respect to $\mu_j$. As Kohlberg and Reny show,
KR-strong independence implies approximate independence$^{10}$ and is, in general, strictly
stronger.

The  following  theorem  characterizes  KR  strong  independence  in  terms  of  NPS’s. 

Theorem 6.6: $X_1, \ldots, X_n$ are KR-strongly independent with respect to the Popper space $(W, \mathcal{F}, \mathcal{F}', \mu)$ iff there exists an NPS $(W, \mathcal{F}, \nu)$ such that $F_{N \to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}', \mu)$ and $X_1, \ldots, X_n$ are independent with respect to $(W, \mathcal{F}, \nu)$.

It  follows  from  the  proof  that  we  can  require  the  range  of  v  to  be  a  nonelementary 
extension  of  the  reals,  but  this  is  not  necessary. 

Kohlberg and Reny show that their notions of weak and strong independence can be used to characterize Kreps and Wilson’s [1982] notion of sequential equilibrium. BBD [1991b] use their notion of strong independence in their characterization of perfect equilibrium and proper equilibrium for games with more than two players. Finally, Battigalli [1996] uses approximate independence (or, equivalently, independence in cps’s) to characterize sequential equilibrium.


7  Discussion 

As the preceding discussion shows, there is a sense in which NPS’s are more general than both Popper spaces and LPS’s. It would be of interest to get a natural characterization of those NPS’s that are equivalent to Popper spaces and LPS’s; this remains an open problem. LPS’s are more expressive than Popper measures in finite spaces and in infinite spaces where we assume countable additivity (in the sense discussed at the end of Section 5), but without assuming countable additivity, they are incomparable, as Examples 3.3 and 3.4 show. Since all of these approaches to representing uncertainty have been used in characterizing solution concepts in extensive-form games and notions of admissibility, the results here suggest that it is worth considering the extent to which these results depend on the particular representation used.

10 They actually show only that it implies weak independence, but the same argument shows that it implies approximate independence.

It  is  worth  stressing  here  that  this  notion  of  equivalence  depends  on  the  fact  that 
I  have  been  viewing  cps’s,  LPS’s,  and  NPS’s  as  representations  of  uncertainty.  But, 
as  Asheim  [2006]  emphasizes,  they  can  also  be  viewed  as  representations  of  conditional 
preferences.  Example  5.3  shows  that,  even  in  finite  spaces,  NPS’s  and  LPS’s  can  express 
preferences  that  cps’s  cannot.  However,  as  Asheim  and  Perea  [2005]  point  out,  in  finite 
spaces,  cps’s  can  also  represent  conditional  preferences  that  cannot  be  represented  by 
LPS’s  and  NPS’s.  See  [Asheim  2006]  for  a  detailed  discussion  of  the  expressive  power  of 
these  representations  with  respect  to  conditional  preferences. 

Although NPS’s are the most expressive of the three approaches I have considered, they have some disadvantages. In particular, working with a nonstandard probability measure requires defining and working with a non-Archimedean field. LPS’s have the advantage of using just standard probability measures. Moreover, their lexicographic structure may give useful insights. It seems to be worth considering the extent to which LPS’s can be generalized so as to increase their expressive power. In particular, it may be of interest to consider LPS’s indexed by partially ordered and not necessarily well-founded sets, rather than just LPS’s indexed by the ordinals. For example, Brandenburger, Friedenberg, and Keisler [2008] characterize $n$ rounds of iterated deletion using finite LPS’s, for any $n$. Rather than using a sequence of (finite) LPS’s of different lengths to characterize (unbounded) iterated deletion, it seems that a result similar in spirit can be obtained using a single LPS indexed by the (positive and negative) integers.

I  conclude  with  a  brief  discussion  of  a  few  other  issues  raised  by  this  paper. 

• Belief: The connections between LPS’s, NPS’s, and cps’s are relevant to the notion of belief. There are two standard notions of belief that can be defined in LPS’s. Say that $U$ is a certain belief in LPS $\vec{\mu}$ of length $\alpha$ if $\mu_\beta(U) = 1$ for all $\beta < \alpha$; $U$ is weakly believed if $\mu_0(U) = 1$. Brandenburger, Friedenberg, and Keisler [2008] defined a third notion of belief, intermediate between weak and strong belief, and provided an elegant decision-theoretic justification of it. According to their definition, an agent assumes $U$ in $\vec{\mu}$ if there is some $\beta < \alpha$ such that (a) $\mu_{\beta'}(U) = 1$ for all $\beta' \le \beta$, (b) $\mu_{\beta''}(U) = 0$ for all $\beta'' > \beta$, and (c) $U \subseteq \cup_{\beta' \le \beta} \mathrm{Supp}(\mu_{\beta'})$, where $\mathrm{Supp}(\mu_\gamma)$ denotes the support of the probability measure $\mu_\gamma$. (Condition (c) is unnecessary if $W$ is finite, given Brandenburger, Friedenberg, and Keisler’s assumption that $W = \cup_\gamma \mathrm{Supp}(\mu_\gamma)$.) There are straightforward analogues of certain belief and weak belief in Popper spaces. $U$ is strongly believed in a Popper space $(W, \mathcal{F}, \mathcal{F}', \mu)$ if $\mu(U \mid V) = 1$ for all $V \in \mathcal{F}'$; $U$ is weakly believed if $\mu(U \mid V) = 1$ for all $V \in \mathcal{F}'$ such that $\mu(V) > 0$. Analogues of this notion of assumption have been considered elsewhere in the literature. Van Fraassen [1995] independently defined a notion of belief using Popper spaces; in a finite state space, an event is what van Fraassen calls a belief core iff it is assumed in the sense of Brandenburger, Friedenberg, and Keisler. Battigalli and Siniscalchi’s [2002] notion of strong belief is also essentially equivalent. Assumption also corresponds to Stalnaker’s [1998] notion of absolutely robust belief and Asheim and Søvik’s [2005] notion of robust belief. Asheim and Søvik [2005] do a careful comparison of all these notions (and others).

It is easy to define analogues of certain and weak belief in NPS’s: $U$ is a certain belief if $\nu(U) = 1$; $U$ is weakly believed if $\mathrm{st}(\nu(U)) = 1$. The results of this paper suggest that it may also be worth investigating an analogue of assumption in NPS’s.

• Nonstandard utility: In this paper, while I have allowed probabilities to be lexicographically ordered or nonstandard, I have implicitly assumed that utilities are standard real numbers (since I have restricted to real-valued random variables). There is a tradition in decision theory going back to Hausner [1954] and continued recently in a sequence of papers by Fishburn and LaValle (see [Fishburn and LaValle 1998] and the references therein) and Hammond [1999] of considering nonstandard or lexicographically-ordered utilities. I have not considered the relationship between these ideas and the ones considered here, but there may be some fruitful connections.

• Countable additivity for NPS’s: Countable additivity for standard probability measures is essentially a continuity condition. The fact that $\sum_{i=1}^{\infty} a_i$ may not be the least upper bound of the partial sums $\sum_{i=1}^{n} a_i$ in an NPS leads to a certain lack of continuity in decision-making. For example, let $W = \{w_1, w_2, \ldots\}$. Consider a nonstandard probability measure $\nu$ such that $\nu(w_1) = 1/3 - \epsilon$, $\nu(w_2) = 1/3 + \epsilon$, and $\nu(w_{k+2}) = 1/(3 \times 2^k)$, for $k = 1, 2, \ldots$. Let $U_n = \{w_3, \ldots, w_n\}$ and let $U_\infty = \{w_3, w_4, \ldots\}$. Clearly $\nu(U_n) \to \nu(U_\infty) = 1/3$. However, $\nu(U_n) < \nu(w_1)$ for all $n$. Thus, $E_\nu(\chi_{\{w_1\}}) > E_\nu(\chi_{U_n})$ for all $n \ge 3$ although $E_\nu(\chi_{\{w_1\}}) < E_\nu(\chi_{U_\infty})$.

Not surprisingly, the same situations can be modeled with LPS’s. Consider the LPS $\vec{\mu} = (\mu_1, \mu_2)$ where $\mu_1 = \mathrm{st}(\nu)$, $\mu_2(w_1) = 0$, $\mu_2(w_2) = 2/3$, and $\mu_2(w_{k+2}) = 1/(3 \times 2^k)$ for $k = 1, 2, \ldots$. It is easy to see that again $E_{\vec{\mu}}(\chi_{\{w_1\}}) > E_{\vec{\mu}}(\chi_{U_n})$ for all $n \ge 3$ although $E_{\vec{\mu}}(\chi_{\{w_1\}}) < E_{\vec{\mu}}(\chi_{U_\infty})$. (A similar example can be obtained using SLPS’s, by replacing each world $w_i$ by a pair of worlds $w_i', w_i''$, where $w_i'$ is in the support of $\mu_1$ and $w_i''$ is in the support of $\mu_2$.)

An analogous continuity problem arises even in finite domains. Let $W = \{w_1, w_2, w_3\}$ and consider a sequence of probability measures $\nu_n$ such that $\nu_n(w_1) = 1/3 - 1/n$, $\nu_n(w_2) = 1/3 - \epsilon$, and $\nu_n(w_3) = 1/3 + 1/n + \epsilon$. Clearly $\nu_n \to \nu$, where $\nu(w_1) = 1/3$, $\nu(w_2) = 1/3 - \epsilon$, and $\nu(w_3) = 1/3 + \epsilon$. However, $E_{\nu_n}(\chi_{\{w_1\}}) < E_{\nu_n}(\chi_{\{w_2\}})$ for all $n$, while $E_\nu(\chi_{\{w_1\}}) > E_\nu(\chi_{\{w_2\}})$. Again, the same situation can be modeled using LPS’s (and even SLPS’s).

Of  course,  continuity  plays  a  significant  role  in  standard  axiomatizations  of  SEU, 
and  is  vital  in  proving  the  existence  of  a  Nash  equilibrium.  None  of  the  uses  of 
continuity  that  I  am  familiar  with  have  the  specific  form  of  this  example,  but  I 
believe  it  is  worth  considering  further  the  impact  of  this  lack  of  continuity. 
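
The first countable-additivity example above can be checked mechanically. The following is a minimal sketch, assuming that a nonstandard value of the form a + b·ε (with ε a positive infinitesimal) is encoded as the pair (a, b) and that such pairs are compared lexicographically. The encoding and the names `nu_w1`, `nu_Un`, and `nu_Uinf` are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction

# Hypothetical encoding: the pair (a, b) stands for a + b*eps, where eps is a
# positive infinitesimal. Python compares tuples lexicographically, which
# matches the order on values of this form.
nu_w1 = (Fraction(1, 3), -1)    # nu(w_1) = 1/3 - eps
nu_Uinf = (Fraction(1, 3), 0)   # nu(U_infinity) = 1/3


def nu_Un(n):
    """nu(U_n) for U_n = {w_3, ..., w_n}: the sum of 1/(3 * 2^k), k = 1..n-2."""
    return (sum(Fraction(1, 3 * 2**k) for k in range(1, n - 1)), 0)


for n in range(3, 40):
    # Every finite union U_n is strictly less likely than w_1 ...
    assert nu_Un(n) < nu_w1
# ... yet the countable union is strictly more likely than w_1.
assert nu_w1 < nu_Uinf
```

The standard parts of `nu_Un(n)` increase toward 1/3 but never reach it, so the infinitesimal term in `nu_w1` decides the comparison only in the limit; this is the discontinuity described in the text.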
A Appendix: Proofs

In  this  section,  I  prove  all  the  results  claimed  in  the  main  part  of  the  paper.  For  the 
convenience  of  the  reader,  I  repeat  the  statements  of  the  results. 

Theorem 3.1: If $W$ is finite and $(\mathcal{F}, \mathcal{F}')$, then $F_{S \to P}$ is a bijection from $SLPS(W, \mathcal{F}, \mathcal{F}')$ to $Pop(W, \mathcal{F}, \mathcal{F}')$.

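Proof: The first step is to show that $F_{S \to P}$ is an injection. If $\vec{\mu}, \vec{\mu}' \in SLPS(W, \mathcal{F}, \mathcal{F}')$ and $\vec{\mu} \ne \vec{\mu}'$, let $\mu = F_{S \to P}(W, \mathcal{F}, \vec{\mu})$, and let $\mu' = F_{S \to P}(W, \mathcal{F}, \vec{\mu}')$. Let $i$ be the least index such that $\mu_i \ne \mu_i'$. There is some set $U$ such that $\mu_i(U) \ne \mu_i'(U)$. Let $U_i$ be the set such that $\mu_i(U_i) = 1$ and $\mu_j(U_i) = 0$ for $j < i$; since $\vec{\mu}$ is an SLPS, such a set $U_i$ exists. Similarly, let $U_i'$ be such that $\mu_i'(U_i') = 1$ and $\mu_j'(U_i') = 0$ for $j < i$. Since $\mu_j = \mu_j'$ for all $j < i$, we must have $\mu_j(U_i \cup U_i') = \mu_j'(U_i \cup U_i') = 0$ for all $j < i$. Clearly $\mu_i(U_i \cup U_i') > 0$, so $U_i \cup U_i' \in \mathcal{F}'$. Moreover, $\mu(U \mid U_i \cup U_i') = \mu_i(U \mid U_i \cup U_i') = \mu_i(U)$. Similarly, $\mu'(U \mid U_i \cup U_i') = \mu_i'(U)$. Hence, $\mu \ne \mu'$.

To show that $F_{S \to P}$ is a surjection, given a cps $\mu$, let $\vec{\mu} = (\mu_0, \ldots, \mu_k)$ be the LPS constructed in the main text. We must show that $F_{S \to P}(\vec{\mu}) = (W, \mathcal{F}, \mathcal{F}', \mu)$. Suppose that $F_{S \to P}(\vec{\mu}) = (W, \mathcal{F}, \mathcal{F}'', \mu')$. I first show that $\mathcal{F}' = \mathcal{F}''$. Suppose that $V \in \mathcal{F}''$. Then $\mu_i(V) > 0$ for some $i$. Thus, $\mu(V \mid U_i) > 0$. Since $U_i \in \mathcal{F}'$, it follows that $V \in \mathcal{F}'$. Thus, $\mathcal{F}'' \subseteq \mathcal{F}'$.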

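To show that $\mathcal{F}' \subseteq \mathcal{F}''$, first note that, by construction, $\mu(U_j \mid \overline{U_0 \cup \ldots \cup U_{j-1}}) = 1$. It easily follows that if $V \subseteq \overline{U_0 \cup \ldots \cup U_{j-1}}$ then

$$\mu(V \mid \overline{U_0 \cup \ldots \cup U_{j-1}}) = \mu(V \cap U_j \mid \overline{U_0 \cup \ldots \cup U_{j-1}}).$$

Thus, by CP3,

$$\mu(V \mid \overline{U_0 \cup \ldots \cup U_{j-1}}) = \mu(V \cap U_j \mid \overline{U_0 \cup \ldots \cup U_{j-1}}) = \mu(V \mid U_j) \times \mu(U_j \mid \overline{U_0 \cup \ldots \cup U_{j-1}}),$$

so

$$\mu(V \mid U_j) = \mu(V \mid \overline{U_0 \cup \ldots \cup U_{j-1}}). \qquad (1)$$

Now suppose that $V \in \mathcal{F}'$. Clearly $V \cap (U_0 \cup \ldots \cup U_k) \ne \emptyset$, for otherwise $V \subseteq \overline{U_0 \cup \ldots \cup U_k}$, contradicting the fact that $\overline{U_0 \cup \ldots \cup U_k} \notin \mathcal{F}'$. Let $j_V$ be the smallest index $j$ such that $V \cap U_j \ne \emptyset$. I claim that $\mu(V \mid \overline{U_0 \cup \ldots \cup U_{j_V - 1}}) \ne 0$. For if $\mu(V \mid \overline{U_0 \cup \ldots \cup U_{j_V - 1}}) = 0$, then $\mu(U_{j_V} - V \mid \overline{U_0 \cup \ldots \cup U_{j_V - 1}}) = 1$, contradicting the definition of $U_{j_V}$ as the smallest set $U'$ such that $\mu(U' \mid \overline{U_0 \cup \ldots \cup U_{j_V - 1}}) = 1$. Moreover, since $V \subseteq \overline{U_0 \cup \ldots \cup U_{j_V - 1}}$, it follows from (1) that $\mu(V \mid U_{j_V}) = \mu(V \mid \overline{U_0 \cup \ldots \cup U_{j_V - 1}}) > 0$. Thus, $\mu_{j_V}(V) > 0$, so $V \in \mathcal{F}''$.

This argument can be extended to show that $\mu(V' \mid V) = \mu'(V' \mid V)$ for all $V' \in \mathcal{F}$. Since $V \cap U_i = \emptyset$ for $i < j_V$, it follows that $\mu'(V' \mid V) = \mu_{j_V}(V' \mid V)$. By CP3, $\mu(V' \mid V) \times \mu(V \mid \overline{U_0 \cup \ldots \cup U_{j_V - 1}}) = \mu(V' \cap V \mid \overline{U_0 \cup \ldots \cup U_{j_V - 1}})$. By (1) and the fact that $\mu(V \mid U_{j_V}) > 0$, it follows that $\mu(V' \mid V) = \mu(V' \cap V \mid U_{j_V}) / \mu(V \mid U_{j_V})$, that is, that $\mu(V' \mid V) = \mu_{j_V}(V' \mid V)$. ∎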
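
To make the map from LPS’s to conditional probabilities concrete, here is a minimal sketch for a finite state space. It assumes, as the proof above and Proposition 4.2 indicate, that the conditional probability of $V$ given $U$ is read off from the first level of the LPS that gives $U$ positive probability. The function names, the dictionary encoding of measures, and the three-world example are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations


def mass(mu, U):
    """Probability that the measure mu (a dict world -> probability) gives to U."""
    return sum((mu.get(w, Fraction(0)) for w in U), Fraction(0))


def conditional(lps, V, U):
    """mu(V | U) induced by the LPS, or None if no level gives U positive mass."""
    for mu in lps:
        if mass(mu, U) > 0:
            return mass(mu, V & U) / mass(mu, U)
    return None


def cp3_holds(lps, events):
    """Check mu(U1 & U2 | U3) == mu(U1 | U2 & U3) * mu(U2 | U3) wherever defined."""
    for U1 in events:
        for U2 in events:
            for U3 in events:
                if conditional(lps, U2 & U3, U2 & U3) is None:
                    continue  # U2 & U3 cannot be conditioned on
                lhs = conditional(lps, U1 & U2, U3)
                rhs = conditional(lps, U1, U2 & U3) * conditional(lps, U2, U3)
                if lhs != rhs:
                    return False
    return True


half = Fraction(1, 2)
# A two-level LPS with disjoint supports {a, b} and {c} (hypothetical example).
lps = [{"a": half, "b": half}, {"c": Fraction(1)}]
worlds = ["a", "b", "c"]
events = [set(c) for r in range(len(worlds) + 1) for c in combinations(worlds, r)]

assert conditional(lps, {"c"}, {"b", "c"}) == 0  # decided at level 0
assert conditional(lps, {"c"}, {"c"}) == 1       # level 0 gives {c} mass 0
assert cp3_holds(lps, events)
```

In this example every nonempty event can be conditioned on, because the supports of the two levels cover all three worlds; the check of CP3 is exhaustive over the eight events.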

Although Theorem 3.5 was proved by Spohn [1986], I include a proof here as well, to make the paper self-contained.

Theorem 3.5: For all $W$, the map $F_{S \to P}$ is a bijection from $SLPS^c(W, \mathcal{F}, \mathcal{F}')$ to $Pop^c(W, \mathcal{F}, \mathcal{F}')$.

Proof: Again, the difficulty comes in showing that $F_{S \to P}$ is onto. As it says in the main text, given a Popper space $(W, \mathcal{F}, \mathcal{F}', \mu)$, the idea is to construct sets $U_0, U_1, \ldots$ and an LPS $\vec{\mu}$ such that $\mu_\beta(V) = \mu(V \mid U_\beta)$, and show that $F_{S \to P}(W, \mathcal{F}, \vec{\mu}) = (W, \mathcal{F}, \mathcal{F}', \mu)$. The construction is somewhat involved.

As a first step, put an order $\le$ on sets in $\mathcal{F}'$ by defining $U \le V$ if $\mu(U \mid U \cup V) > 0$. (Essentially, the same order is considered by van Fraassen [1976].)

Lemma A.1: $\le$ is transitive.

Proof: By definition, if $U \le V$ and $V \le V'$, then $\mu(U \mid U \cup V) > 0$ and $\mu(V \mid V \cup V') > 0$. To see that $\mu(U \mid U \cup V') > 0$, note that $\mu(U \mid U \cup V \cup V') + \mu(V \mid U \cup V \cup V') + \mu(V' \mid U \cup V \cup V') \ge 1$, so at least one of $\mu(U \mid U \cup V \cup V')$, $\mu(V \mid U \cup V \cup V')$, or $\mu(V' \mid U \cup V \cup V')$ is positive. I consider each of the cases separately.

Case 1: Suppose that $\mu(U \mid U \cup V \cup V') > 0$. By CP3,

$$\mu(U \mid U \cup V \cup V') = \mu(U \mid U \cup V') \times \mu(U \cup V' \mid U \cup V \cup V').$$

Thus, $\mu(U \mid U \cup V') > 0$, as desired.

Case 2: Suppose that $\mu(V \mid U \cup V \cup V') > 0$. By assumption, $\mu(U \mid U \cup V) > 0$; since $\mu(V \mid U \cup V \cup V') > 0$, it follows that $\mu(U \cup V \mid U \cup V \cup V') > 0$. Thus, by CP3,

$$\mu(U \mid U \cup V \cup V') = \mu(U \mid U \cup V) \times \mu(U \cup V \mid U \cup V \cup V') > 0.$$

Thus, case 2 can be reduced to case 1.

Case 3: Suppose that $\mu(V' \mid U \cup V \cup V') > 0$. By assumption, $\mu(V \mid V \cup V') > 0$; since $\mu(V' \mid U \cup V \cup V') > 0$, it follows that $\mu(V \cup V' \mid U \cup V \cup V') > 0$. Thus, by CP3,

$$\mu(V \mid U \cup V \cup V') = \mu(V \mid V \cup V') \times \mu(V \cup V' \mid U \cup V \cup V') > 0.$$

Thus, case 3 can be reduced to case 2.

This completes the proof, showing that $\le$ is transitive. ∎

Define $U \sim V$ if $U \le V$ and $V \le U$.

Lemma A.2: $\sim$ is an equivalence relation on $\mathcal{F}'$.

Proof: It is immediate from the definition that $\sim$ is reflexive and symmetric; transitivity follows from the transitivity of $\le$. ∎

Rényi [1956] and van Fraassen [1976] also considered the $\sim$ relation in their papers, and the argument that $\le$ is transitive is similar in spirit to Rényi’s argument that $\sim$ is transitive. However, the rest of this proof diverges from those of Rényi and van Fraassen.

Let $[U]$ denote the $\sim$-equivalence class of $U$, and let $\mathcal{F}'/\!\sim\; = \{[U] : U \in \mathcal{F}'\}$.

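Lemma A.3: Each equivalence class $[V] \in \mathcal{F}'/\!\sim$ is closed under countable unions.

Proof: Suppose that $V_1, V_2, \ldots \in [V]$. I must show that $\cup_{i=1}^{\infty} V_i \in [V]$. Clearly $\cup_{i=1}^{\infty} V_i \le V_j$ for all $j$. Suppose, by way of contradiction, that $V_j \not\le \cup_{i=1}^{\infty} V_i$ for some $j$. Since $\le$ is transitive, it follows that $V_j \not\le \cup_{i=1}^{\infty} V_i$ for all $j$. Thus, $\mu(V_j \mid \cup_{i=1}^{\infty} V_i) = 0$ for all $j$. But then, by countable additivity,

$$1 = \mu(\cup_{i=1}^{\infty} V_i \mid \cup_{i=1}^{\infty} V_i) \le \sum_{i=1}^{\infty} \mu(V_i \mid \cup_{i=1}^{\infty} V_i) = 0,$$

a contradiction. Thus, $[V]$ is closed under countable unions. ∎

Fix an element $V_0 \in [V]$.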

Lemma A.4: $\inf\{\mu(V_0 \mid V_0 \cup V') : V' \in [V]\} > 0$.

Proof: Suppose that $\inf\{\mu(V_0 \mid V_0 \cup V') : V' \in [V]\} = 0$. Then there exist sets $V_1, V_2, \ldots \in [V]$ such that $\mu(V_0 \mid V_0 \cup V_n) < 1/n$. Since $[V]$ is closed under countable unions, $\cup_{i=1}^{\infty} V_i \in [V]$. Since $V_0 \sim \cup_{i=1}^{\infty} V_i$, it follows that $\mu(V_0 \mid \cup_{i=0}^{\infty} V_i) > 0$. But, by CP3,

$$\mu(V_0 \mid \cup_{i=0}^{\infty} V_i) = \mu(V_0 \mid V_0 \cup V_n) \times \mu(V_0 \cup V_n \mid \cup_{i=0}^{\infty} V_i) \le \mu(V_0 \mid V_0 \cup V_n) < 1/n.$$

Since this is true for all $n > 0$, it follows that $\mu(V_0 \mid \cup_{i=0}^{\infty} V_i) = 0$, a contradiction. ∎

The next lemma shows that each equivalence class in $\mathcal{F}'/\!\sim$ has a “maximal element”.
Lemma A.5: In each equivalence class $[V]$, there is an element $V^* \in [V]$ such that $\mu(V^* \mid V' \cup V^*) = 1$ for all $V' \in [V]$.

Proof: Again, fix an element $V_0 \in [V]$. By Lemma A.4, there exists some $\alpha_V > 0$ such that $\inf\{\mu(V_0 \mid V_0 \cup V') : V' \in [V]\} = \alpha_V$. Thus, there exist sets $V_1, V_2, V_3, \ldots \in [V]$ such that $\mu(V_0 \mid V_0 \cup V_n) < \alpha_V + 1/n$. By Lemma A.3, $V^* = \cup_{i=0}^{\infty} V_i \in [V]$. By CP3,

$$\mu(V_0 \mid V^*) = \mu(V_0 \mid V_0 \cup V_n) \times \mu(V_0 \cup V_n \mid V^*) \le \mu(V_0 \mid V_0 \cup V_n) < \alpha_V + 1/n.$$

Thus, $\mu(V_0 \mid V^*) \le \alpha_V$. By choice of $\alpha_V$, it follows that $\mu(V_0 \mid V^*) = \alpha_V$.

Suppose that $\mu(V^* \mid V' \cup V^*) < 1$ for some $V' \in [V]$. But then, by CP3,

$$\mu(V_0 \mid V' \cup V^*) = \mu(V_0 \mid V^*) \times \mu(V^* \mid V' \cup V^*) < \alpha_V,$$

contradicting the choice of $\alpha_V$. Thus, $\mu(V^* \mid V' \cup V^*) = 1$ for all $V' \in [V]$. ∎

Define a total order on these equivalence classes by taking $[U] \le [V]$ if $U' \le V'$ for some $U' \in [U]$ and $V' \in [V]$. It is easy to check (using the transitivity of $\le$) that if $U' \le V'$ for some $U' \in [U]$ and some $V' \in [V]$, then $U'' \le V''$ for all $U'' \in [U]$ and all $V'' \in [V]$.

Lemma A.6: $<$ is a well-founded relation on $\mathcal{F}'/\!\sim$.

Proof: Note that if $[U] < [V]$, then $\mu(V \mid U \cup V) = 0$. It now follows from countable additivity that $<$ is a well-founded order on these equivalence classes. For suppose that there exists an infinite decreasing sequence $[U_0] > [U_1] > [U_2] > \ldots$. Since $\mathcal{F}$ is a $\sigma$-algebra, $\cup_{i=0}^{\infty} U_i \in \mathcal{F}$; since $\mathcal{F}'$ is closed under supersets, $\cup_{i=0}^{\infty} U_i \in \mathcal{F}'$. By CP3,

$$\mu(U_j \mid \cup_{i=0}^{\infty} U_i) = \mu(U_j \mid U_j \cup U_{j+1}) \times \mu(U_j \cup U_{j+1} \mid \cup_{i=0}^{\infty} U_i) = 0.$$

Let $V_0 = U_0$ and, for $j > 0$, let $V_j = U_j - (\cup_{i=0}^{j-1} U_i)$. Clearly the $V_j$’s are pairwise disjoint, $\cup_i U_i = \cup_i V_i$, and $\mu(V_j \mid \cup_{i=0}^{\infty} U_i) \le \mu(U_j \mid \cup_{i=0}^{\infty} U_i) = 0$. It now follows using countable additivity that

$$1 = \mu(\cup_{i=0}^{\infty} U_i \mid \cup_{i=0}^{\infty} U_i) = \sum_{i=0}^{\infty} \mu(V_i \mid \cup_{i=0}^{\infty} U_i) = 0.$$

This is a contradiction, so the equivalence classes are well-founded. ∎

Because $<$ is well-founded, there is an order-preserving injection $O$ from $\mathcal{F}'/\!\sim$ to an initial segment of the ordinals (i.e., $[U] < [V]$ iff $O([U]) < O([V])$). Thus, the equivalence classes can be enumerated using all the ordinals less than some ordinal $\alpha$. By Lemma A.5, there are sets $U_\beta$, $\beta < \alpha$, in $\mathcal{F}'$ such that if $O([U]) = \beta$, then $U_\beta \in [U]$ and $\mu(U_\beta \mid U' \cup U_\beta) = 1$ for all $U' \in [U]$. Define an LPS $\vec{\mu} = (\mu_0, \mu_1, \ldots)$ of length $\alpha$ by taking $\mu_\beta(V) = \mu(V \mid U_\beta)$. The choice of the $U_\beta$’s guarantees that this is actually an SLPS.
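It remains to show that $(W, \mathcal{F}, \mathcal{F}', \mu)$ is the result of applying $F_{S \to P}$ to $(W, \mathcal{F}, \vec{\mu})$. Suppose that instead $(W, \mathcal{F}, \mathcal{F}'', \mu')$ is the result. The argument that $\mathcal{F}'' \subseteq \mathcal{F}'$ is identical to that in the finite case: If $V \in \mathcal{F}''$, then $\mu_\beta(V) > 0$ for some $\beta$. Thus, $\mu(V \mid U_\beta) > 0$. Since $U_\beta \in \mathcal{F}'$, it follows that $V \in \mathcal{F}'$. Thus, $\mathcal{F}'' \subseteq \mathcal{F}'$.

Now suppose that $V \in \mathcal{F}'$. Thus, $V \sim U_\beta$ for some $\beta < \alpha$. It follows that $\mu(V \mid U_\beta) > 0$, so $V \in \mathcal{F}''$.

Finally, to show that $\mu(U \mid V) = \mu'(U \mid V)$, suppose that $\beta$ is such that $V \sim U_\beta$. It follows that $\mu(V \mid U_{\beta'}) = 0$ for $\beta' < \beta$ and $\mu(V \mid U_\beta) > 0$. Thus, by definition, $\mu'(U \mid V) = \mu_\beta(U \mid V)$. Without loss of generality, assume that $U \subseteq V$ (otherwise replace $U$ by $U \cap V$). Thus, by CP3,

$$\mu(U \mid V) \times \mu(V \mid V \cup U_\beta) = \mu(U \mid V \cup U_\beta). \qquad (2)$$

Suppose $V' \subseteq V$. Clearly

$$\mu(V' \mid V \cup U_\beta) = \mu(V' \cap U_\beta \mid V \cup U_\beta) + \mu(V' \cap \overline{U_\beta} \mid V \cup U_\beta).$$

Now by CP3 and the fact that $\mu(U_\beta \mid V \cup U_\beta) = 1$,

$$\mu(V' \cap U_\beta \mid V \cup U_\beta) = \mu(V' \mid U_\beta) \times \mu(U_\beta \mid V \cup U_\beta) = \mu(V' \mid U_\beta)$$

and

$$\mu(V' \cap \overline{U_\beta} \mid V \cup U_\beta) \le \mu(\overline{U_\beta} \mid V \cup U_\beta) = 0.$$

Thus, $\mu(V' \mid V \cup U_\beta) = \mu(V' \mid U_\beta)$. Applying this observation to both $U$ and $V$ shows that $\mu(V \mid V \cup U_\beta) = \mu(V \mid U_\beta)$ and $\mu(U \mid V \cup U_\beta) = \mu(U \mid U_\beta)$. Plugging this into (2), it follows that

$$\mu(U \mid V) = \mu(U \mid U_\beta) / \mu(V \mid U_\beta) = \mu_\beta(U) / \mu_\beta(V) = \mu_\beta(U \mid V) = \mu'(U \mid V).$$

This completes the proof of the theorem. ∎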

Proposition 3.9: The map $F_{S \to P}$ is a surjection from $SLPS^c(W, \mathcal{F}, \mathcal{F}')$ onto $\mathcal{T}^c(W, \mathcal{F}, \mathcal{F}')$.

Proof: Suppose that $\mu \in \mathcal{T}^c(W, \mathcal{F}, \mathcal{F}')$. I want to construct an SLPS $\vec{\mu} \in SLPS^c(W, \mathcal{F}, \mathcal{F}')$ such that $F_{S \to P}(\vec{\mu}) = \mu$. I first label each element of $\mathcal{F}'$ with a natural number. Intuitively, if $U \in \mathcal{F}'$ is labeled $k$, then $k$ will be the least index such that $\mu_k(U) > 0$. The labeling is done by induction on $k$. Each topmost set in the forest (i.e., the root of some tree in the forest) is labeled 0, as are all sets $U'$ such that $\mu(U' \mid U) > 0$, where $U$ is a topmost node. These are all the nodes labeled by 0. Label all the maximal unlabeled sets by 1 (that is, label $U \in \mathcal{F}'$ by 1 if it is not labeled 0, and is not a subset of another unlabeled set); in addition, label a set $U'$ by 1 if $\mu(U' \mid U) > 0$ and $U$ is labeled by 1. Note that every set at depth 0 or 1 in the forest is labeled by either 0 or 1.

Suppose that the labeling process has been completed for labels $0, \ldots, k$ such that the following properties hold, where $label(U)$ denotes the label of the event $U$:

• all sets up to depth $k$ in the forest have been labeled;

• if $label(U) = k'$, $U' \in \mathcal{F}'$, and $\mu(U' \mid U) > 0$, then $label(U') \le label(U)$.

Label all the maximal unlabeled sets with $k + 1$; in addition, if $U'$ is unlabeled and $\mu(U' \mid U) > 0$ for some $U$ such that $label(U) = k + 1$, then assign label $k + 1$ to $U'$. Clearly the two properties above continue to hold. This completes the labeling process.

Let $\mathcal{C}_k$ be the set of maximal sets in $\mathcal{F}'$ labeled $k$. T2 and T3 guarantee that, for all $k$, the sets in $\mathcal{C}_k$ are disjoint. Let $\mu_k'$ be an arbitrary probability on $W$ such that $\mu_k'(U) > 0$ for all $U \in \mathcal{C}_k$ and $\sum_{U \in \mathcal{C}_k} \mu_k'(U) = 1$. Define an LPS $\vec{\mu} = (\mu_0, \mu_1, \ldots)$ as follows (where the length of $\vec{\mu}$ is $\omega$ if $\mathcal{C}_k \ne \emptyset$ for all $k$, and is $k + 1$ if $k$ is the largest integer such that $\mathcal{C}_k \ne \emptyset$). For $V \in \mathcal{F}$, let $\mu_j(V) = \sum_{U \in \mathcal{C}_j} \mu(V \mid U) \mu_j'(U)$. I now show that $\vec{\mu}(V \mid U) = \mu(V \mid U)$ for all $V \in \mathcal{F}$ and $U \in \mathcal{F}'$. Suppose that $U \in \mathcal{C}_k$. Then $\mu_j(U) = 0$ for all $j < k$, and $\mu_k(U) > 0$. Thus, $\vec{\mu}(V \mid U) = \mu_k(V \mid U)$. But it is immediate from the definition that $\mu_k(V \mid U) = \mu(V \mid U)$. Thus, $F_{S \to P}(\vec{\mu}) = \mu$. Moreover, if $U \in \mathcal{F}'$ and $label(U) = k$, let $U'$ be the maximal set containing $U$ such that $label(U') = k$. (The labeling guarantees that such a set exists.) Then $\mu_k(U') = \mu(U' \mid U) > 0$. It follows that $\vec{\mu}(U) > 0$ for all $U \in \mathcal{F}'$. Finally, note that $\vec{\mu}$ is an SLPS (in fact, an LCPS). If $U_k = \cup \mathcal{C}_k - \cup_{k' > k}(\cup \mathcal{C}_{k'})$, then the sets $U_k$ are disjoint, and $\mu_k(U_k) = 1$. ∎

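Proposition 4.2: If $\nu \approx \vec{\mu}$, then $\nu(U) > 0$ iff $\vec{\mu}(U) > 0$. Moreover, if $\nu(U) > 0$, then $\mathrm{st}(\nu(V \mid U)) = \mu_j(V \mid U)$, where $\mu_j$ is the first probability measure in $\vec{\mu}$ such that $\mu_j(U) > 0$.

Proof: Recall that for $U \subseteq W$, $\chi_U$ is the indicator function for $U$; that is, $\chi_U(w) = 1$ if $w \in U$ and $\chi_U(w) = 0$ otherwise. Notice that $E_\nu(\chi_U) > E_\nu(\chi_\emptyset)$ iff $\nu(U) > 0$ and $E_{\vec{\mu}}(\chi_U) > E_{\vec{\mu}}(\chi_\emptyset)$ iff $\vec{\mu}(U) > 0$. Since $\nu \approx \vec{\mu}$, it follows that $\nu(U) > 0$ iff $\vec{\mu}(U) > 0$. If $\nu(U) > 0$, note that $E_\nu(\chi_{U \cap V} - r\chi_U) > E_\nu(\chi_\emptyset)$ iff $r < \mathrm{st}(\nu(V \mid U))$. Similarly, $E_{\vec{\mu}}(\chi_{U \cap V} - r\chi_U) > E_{\vec{\mu}}(\chi_\emptyset)$ iff $r < \mu_j(V \mid U)$, where $j$ is the least index such that $\mu_j(U) > 0$. It follows that $\mathrm{st}(\nu(V \mid U)) = \mu_j(V \mid U)$. ∎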

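Proposition 4.3: If $\vec{\mu}, \vec{\mu}' \in SLPS(W, \mathcal{F})$, then $\vec{\mu} \approx \vec{\mu}'$ iff $\vec{\mu} = \vec{\mu}'$.

Proof: Clearly $\vec{\mu} = \vec{\mu}'$ implies that $\vec{\mu} \approx \vec{\mu}'$. For the converse, suppose that $\vec{\mu} \approx \vec{\mu}'$ for $\vec{\mu}, \vec{\mu}' \in SLPS(W, \mathcal{F})$. If $\vec{\mu} \ne \vec{\mu}'$, let $\alpha$ be the least ordinal such that $\mu_\alpha \ne \mu_\alpha'$, and let $U$ be such that $\mu_\alpha(U) \ne \mu_\alpha'(U)$. Without loss of generality, suppose that $\mu_\alpha(U) > \mu_\alpha'(U)$. Let the sets $U_\beta$ be such that $\mu_\beta(U_\beta) = 1$ and $\mu_\beta(U_\gamma) = 0$ if $\gamma > \beta$; similarly choose the sets $U_\beta'$. Since $\mu_\beta = \mu_\beta'$ for $\beta < \alpha$, it follows that $\mu_\beta(U_\alpha \cup U_\alpha') = \mu_\beta'(U_\alpha \cup U_\alpha') = 0$ for $\beta < \alpha$; moreover, $\mu_\alpha(U_\alpha \cup U_\alpha') = \mu_\alpha'(U_\alpha \cup U_\alpha') = 1$. Choose $r$ such that $\mu_\alpha(U) > r > \mu_\alpha'(U)$. Let $X$ be the random variable $\chi_U - r\chi_{U_\alpha \cup U_\alpha'}$ and let $Y = \chi_\emptyset$. Then $E_{\vec{\mu}}(X) > E_{\vec{\mu}}(Y)$, while $E_{\vec{\mu}'}(X) < E_{\vec{\mu}'}(Y)$, so $\vec{\mu} \not\approx \vec{\mu}'$. ∎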
Proposition 4.4: If $W$ is finite, then every LPS over $(W, \mathcal{F})$ is equivalent to an LPS of length at most $|Basic(\mathcal{F})|$.

Proof: Suppose that $W$ is finite and $Basic(\mathcal{F}) = \{U_1, \ldots, U_k\}$. Given an LPS $\vec{\mu}$, define a finite subsequence $\vec{\mu}' = (\mu_{k_0}, \ldots, \mu_{k_h})$ of $\vec{\mu}$ as follows. Let $\mu_{k_0} = \mu_0$. Suppose that $\mu_{k_0}, \ldots, \mu_{k_j}$ have been defined. If all probability measures in $\vec{\mu}$ with index greater than $k_j$ are linear combinations of the probability measures $\mu_{k_0}, \ldots, \mu_{k_j}$, then take $\vec{\mu}' = (\mu_{k_0}, \ldots, \mu_{k_j})$. Otherwise, let $\mu_{k_{j+1}}$ be the probability measure in $\vec{\mu}$ with least index that is not a linear combination of $\mu_{k_0}, \ldots, \mu_{k_j}$. Since a probability measure over $(W, \mathcal{F})$ is determined by its value on the sets in $Basic(\mathcal{F})$, a probability measure over $(W, \mathcal{F})$ can be identified with a vector in $\mathbb{R}^{|Basic(\mathcal{F})|}$: the vector defining the probabilities of the elements in $Basic(\mathcal{F})$. There can be at most $|Basic(\mathcal{F})|$ linearly independent such vectors; thus $\vec{\mu}'$ has length at most $|Basic(\mathcal{F})|$.

It remains to show that $\vec{\mu}'$ is equivalent to $\vec{\mu}$. Given random variables $X$ and $Y$, suppose that $E_{\vec{\mu}}(X) < E_{\vec{\mu}}(Y)$. Then there is some minimal index $\beta$ such that $E_{\mu_\gamma}(X) = E_{\mu_\gamma}(Y)$ for all $\gamma < \beta$ and $E_{\mu_\beta}(X) < E_{\mu_\beta}(Y)$. It follows that $\mu_\beta$ cannot be a linear combination of $\mu_\gamma$ for $\gamma < \beta$. Thus, $\mu_\beta$ is one of the probability measures in $\vec{\mu}'$. Moreover, the expected values of $X$ and $Y$ agree for all probability measures in $\vec{\mu}'$ with lower index (since they do in $\vec{\mu}$). Thus, $E_{\vec{\mu}'}(X) < E_{\vec{\mu}'}(Y)$.

The argument in the other direction is similar in spirit and left to the reader. ∎
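
The pruning step in this proof can be sketched in code. The following is a minimal illustration, assuming that each probability measure is given as a tuple of exact rationals over the basic sets; the function names and the five-level example are hypothetical and not from the paper. It drops every level that is a linear combination of the levels already kept, and then checks on a small grid of random variables that the lexicographic comparison of expectations is unchanged.

```python
from fractions import Fraction
from itertools import product


def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def prune(lps):
    """Keep each measure that is not a linear combination of those already kept."""
    kept = []
    for mu in lps:
        if rank(kept + [mu]) > rank(kept):
            kept.append(mu)
    return kept


def lex_less(lps, x, y):
    """True if E(x) < E(y) in the lexicographic order over the levels of lps."""
    ex = [sum(p * v for p, v in zip(mu, x)) for mu in lps]
    ey = [sum(p * v for p, v in zip(mu, y)) for mu in lps]
    return ex < ey  # Python compares lists lexicographically


h, q = Fraction(1, 2), Fraction(1, 4)
# Levels 2 and 4 (1-based) are linear combinations of earlier levels.
lps = [(h, h, 0), (h, h, 0), (0, h, h), (q, h, q), (0, 0, 1)]
pruned = prune(lps)

random_variables = list(product((-1, 0, 1), repeat=3))
same_order = all(
    lex_less(lps, x, y) == lex_less(pruned, x, y)
    for x in random_variables
    for y in random_variables
)
```

Here `pruned` has three levels, matching the bound $|Basic(\mathcal{F})| = 3$ for this example, and `same_order` records whether the two LPS’s rank every pair of test variables identically. The grid check is only a finite spot check, not a proof of equivalence.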

Theorem 4.5: If $W$ is finite, then $F_{L \to N}$ is a bijection from $LPS(W, \mathcal{F})/\!\approx$ to $NPS(W, \mathcal{F})/\!\approx$ that preserves equivalence (that is, each NPS in $F_{L \to N}([\vec{\mu}])$ is equivalent to $\vec{\mu}$).

Proof: I first provide a sufficient condition for an NPS to be equivalent to an LPS in a finite space.

Lemma A.7: Suppose that $\vec{\mu} = (\mu_0, \ldots, \mu_k)$, and $\epsilon_0, \ldots, \epsilon_k$ are such that $\mathrm{st}(\epsilon_{i+1}/\epsilon_i) = 0$ for $i = 1, \ldots, k - 1$ and $\sum_{i=0}^{k} \epsilon_i = 1$. Then $\vec{\mu} \approx \epsilon_0\mu_0 + \cdots + \epsilon_k\mu_k$.$^{11}$

Proof: Suppose that there exist $\epsilon_0, \ldots, \epsilon_k$ as in the statement of the lemma and $\nu = \epsilon_0\mu_0 + \cdots + \epsilon_k\mu_k$. I want to show that $\vec{\mu} \approx \nu$.

If $E_{\vec{\mu}}(X) < E_{\vec{\mu}}(Y)$, then there exists some $j \le k$ such that $E_{\mu_j}(X) < E_{\mu_j}(Y)$ and $E_{\mu_{j'}}(X) = E_{\mu_{j'}}(Y)$ for all $j' < j$. Since $E_\nu(X) = \sum_{i=0}^{k} \epsilon_i E_{\mu_i}(X)$ and $E_\nu(Y) = \sum_{i=0}^{k} \epsilon_i E_{\mu_i}(Y)$, to show that $E_\nu(X) < E_\nu(Y)$, it suffices to show that $\epsilon_j(E_{\mu_j}(Y) - E_{\mu_j}(X)) > \sum_{i=j+1}^{k} \epsilon_i(E_{\mu_i}(X) - E_{\mu_i}(Y))$. Since $\epsilon_{j'+1} < \epsilon_{j'}$ for $j' \ge j$ (this follows from the fact that $\mathrm{st}(\epsilon_{j'+1}/\epsilon_{j'}) = 0$), it follows that $\sum_{i=j+1}^{k} \epsilon_i(E_{\mu_i}(X) - E_{\mu_i}(Y)) \le \epsilon_{j+1} \sum_{i=j+1}^{k} |E_{\mu_i}(X) - E_{\mu_i}(Y)|$. Thus, it suffices to show that $\epsilon_{j+1} \sum_{i=j+1}^{k} |E_{\mu_i}(X) - E_{\mu_i}(Y)| < \epsilon_j(E_{\mu_j}(Y) - E_{\mu_j}(X))$. This is trivially the case if $E_{\mu_i}(X) = E_{\mu_i}(Y)$ for all $i$ such that $j + 1 \le i \le k$. Thus, assume without loss of generality that $\sum_{i=j+1}^{k} |E_{\mu_i}(X) - E_{\mu_i}(Y)| > 0$. In this case, it suffices to show that $\epsilon_{j+1}/\epsilon_j < (E_{\mu_j}(Y) - E_{\mu_j}(X)) / \sum_{i=j+1}^{k} |E_{\mu_i}(X) - E_{\mu_i}(Y)|$. Since the right-hand side of the inequality is a positive real and $\mathrm{st}(\epsilon_{j+1}/\epsilon_j) = 0$, the result follows.

The argument in the opposite direction is similar. Suppose that $E_\nu(X) < E_\nu(Y)$. Again, since $E_\nu(X) = \sum_{i=0}^{k} \epsilon_i E_{\mu_i}(X)$ and $E_\nu(Y) = \sum_{i=0}^{k} \epsilon_i E_{\mu_i}(Y)$, it must be the case that if $j$ is the least index such that $E_{\mu_j}(X) \ne E_{\mu_j}(Y)$, then $E_{\mu_j}(X) < E_{\mu_j}(Y)$. Thus, $E_{\vec{\mu}}(X) < E_{\vec{\mu}}(Y)$. It follows that $\vec{\mu} \approx \nu$. ∎

11 Although I do not need this fact here, it is easy to see that if $W$ is finite and $\vec{\mu} = (\mu_0, \ldots, \mu_k)$ is an SLPS in $LPS(W, \mathcal{F})$, then the converse of Lemma A.7 holds as well: if $\nu \approx \vec{\mu}$, then $\nu = \epsilon_0\mu_0 + \cdots + \epsilon_k\mu_k$ for some $\epsilon_0, \ldots, \epsilon_k$ such that $\mathrm{st}(\epsilon_{i+1}/\epsilon_i) = 0$ for $i = 1, \ldots, k - 1$ and $\sum_{i=0}^{k} \epsilon_i = 1$. (I conjecture this fact is true in general, not just if $\vec{\mu}$ is an SLPS, but I have not checked this.)
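
Lemma A.7 can be illustrated numerically, with a loud caveat: the sketch below replaces the infinitesimal ratios by a small standard rational (`eps = 10**-6`), which is an assumption made purely for illustration, since floating or rational arithmetic has no true infinitesimals. With weights proportional to 1, eps, eps², the mixture ranks a small grid of random variables exactly as the LPS does lexicographically. All names and the three-level example are hypothetical.

```python
from fractions import Fraction
from itertools import product


def expectation(mu, x):
    """Expected value of the random variable x under the measure mu."""
    return sum(p * v for p, v in zip(mu, x))


def lps_less(lps, x, y):
    """Lexicographic comparison of the expectation vectors of x and y."""
    return [expectation(mu, x) for mu in lps] < [expectation(mu, y) for mu in lps]


def mixture(lps, eps):
    """Weights proportional to eps**i, normalized to sum to 1 (stand-in for eps_i)."""
    weights = [eps**i for i in range(len(lps))]
    total = sum(weights)
    n = len(lps[0])
    return [sum(w * mu[s] for w, mu in zip(weights, lps)) / total for s in range(n)]


q = Fraction(1, 4)
lps = [(2 * q, 2 * q, 0), (0, 2 * q, 2 * q), (q, q, 2 * q)]
nu = mixture(lps, Fraction(1, 10**6))  # small real standing in for an infinitesimal

random_variables = list(product((-1, 0, 1), repeat=3))
agree = all(
    lps_less(lps, x, y) == (expectation(nu, x) < expectation(nu, y))
    for x in random_variables
    for y in random_variables
)
```

The agreement here relies on the test variables and probabilities having coarse values, so that the first nonzero difference in expectations dominates the eps-weighted tail. With a genuine infinitesimal, as in the lemma, no such restriction is needed.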

It remains to show that, given an NPS $(W, \mathcal{F}, \nu)$, there is an equivalence class $[\vec{\mu}]$
such that $F_{L \to N}([\vec{\mu}]) = [\nu]$. As I said in the main text, the goal now is to find (standard)
probability measures $\mu_0, \ldots, \mu_k$ and $\epsilon_0, \ldots, \epsilon_k$ such that $\mathrm{st}(\epsilon_{i+1}/\epsilon_i) = 0$ and
$\nu = \epsilon_0 \mu_0 + \cdots + \epsilon_k \mu_k$. If this can be done then, by Lemma A.7, $\nu \approx (\mu_0, \ldots, \mu_k)$, and we are done.

Suppose that $\mathit{Basic}(\mathcal{F}) = \{U_1, \ldots, U_k\}$ and that $\nu$ has range $\mathbb{R}^*$. Note that a
probability measure $\nu'$ on $\mathcal{F}$ can be identified with a vector $(a_1, \ldots, a_k)$ over $\mathbb{R}^*$, where
$\nu'(U_i) = a_i$, so that $a_1 + \cdots + a_k = 1$. In the rest of this proof, I frequently identify $\nu$
with such a vector.

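Lemma A.8: There exist $k' \le k$ and $\epsilon_0, \ldots, \epsilon_{k'}$, where $\epsilon_0 = 1$ and $\mathrm{st}(\epsilon_{i+1}/\epsilon_i) = 0$ for
$i = 1, \ldots, k' - 1$, and standard real-valued vectors $b_j$, $j = 0, \ldots, k'$, in $\mathbb{R}^k$ such that

$$\nu = \sum_{j=0}^{k'} \epsilon_j b_j.$$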

Proof: I show by induction on $m \le k$ that there exist $\epsilon_0, \ldots, \epsilon_m$ and $m' \le m$ such
that $\epsilon_j = 0$ for $j > m'$, $\mathrm{st}(\epsilon_{i+1}/\epsilon_i) = 0$ for $i = 1, \ldots, m' - 1$, and standard vectors $b_j$,
$j = 0, \ldots, m - 1$, and a possibly nonstandard vector $b'_m = (b'_{m1}, \ldots, b'_{mk})$ such that (a)
$\nu = \sum_{j=0}^{m-1} \epsilon_j b_j + \epsilon_m b'_m$, (b) $|b'_{mi}| \le 1$, and (c) at least $m$ of $b'_{m1}, \ldots, b'_{mk}$ are standard.

For the base case (where $m = 0$), just take $b'_0 = \nu$ and $\epsilon_0 = 1$. For the inductive
step, suppose that $0 \le m < k$. If $b'_m$ is standard, then take $b_m = b'_m$, $b'_{m+1} = \vec{0}$, and
$\epsilon_{m+1} = 0$. Otherwise, let $b_m = \mathrm{st}(b'_m)$ and let $b''_{m+1} = b'_m - b_m$. Let
$\epsilon' = \max\{|b''_{(m+1)i}| : i = 1, \ldots, k\}$. Since not all components of $b'_m$ are standard, $\epsilon' > 0$. Note that, by
construction, $\mathrm{st}(\epsilon'/b_{mi}) = 0$ if $b_{mi} \ne 0$, for $i = 1, \ldots, k$. Let $b'_{m+1} = b''_{m+1}/\epsilon'$ and let
$\epsilon_{m+1} = \epsilon' \epsilon_m$. By construction, $|b'_{(m+1)i}| \le 1$ and at least one component of $b'_{m+1}$ is either
$1$ or $-1$. Moreover, if $b'_{mi}$ is standard, then $b''_{(m+1)i} = b'_{(m+1)i} = 0$. Thus, $b'_{m+1}$ has at
least one more standard component than $b'_m$. Since clearly $\nu = \sum_{j=0}^{m} \epsilon_j b_j + \epsilon_{m+1} b'_{m+1}$, this
completes the inductive step. The lemma follows immediately. ∎
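
As a worked illustration of this construction (a sketch on a hypothetical two-element example, not taken from the paper), suppose that $\mathit{Basic}(\mathcal{F}) = \{U_1, U_2\}$, so that $k = 2$, and that $\nu = (\frac12 + \epsilon, \frac12 - \epsilon)$ for some positive infinitesimal $\epsilon$. The induction then runs as follows:

```latex
\begin{align*}
m = 0:\quad & b'_0 = \nu = (\tfrac12 + \epsilon,\ \tfrac12 - \epsilon), \qquad \epsilon_0 = 1; \\
& b_0 = \mathrm{st}(b'_0) = (\tfrac12,\ \tfrac12), \qquad
  b''_1 = b'_0 - b_0 = (\epsilon,\ -\epsilon); \\
& \epsilon' = \max\{|\epsilon|,\ |{-\epsilon}|\} = \epsilon, \qquad
  b'_1 = b''_1/\epsilon' = (1,\ -1), \qquad
  \epsilon_1 = \epsilon' \epsilon_0 = \epsilon; \\
m = 1:\quad & b'_1 \text{ is standard, so } b_1 = b'_1 = (1,\ -1) \text{ and } \epsilon_2 = 0; \\
& \nu = \epsilon_0 b_0 + \epsilon_1 b_1 = (\tfrac12,\ \tfrac12) + \epsilon\,(1,\ -1).
\end{align*}
```

Here $k' = 1$. Note that $b_1$ has a negative entry, so it is not a probability measure; the remainder of the proof of Theorem 4.5 deals with exactly this issue.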

Returning to the proof of Theorem 4.5, I next prove by induction on $m$ that for
all $m \le k'$ (where $k' \le k$ is as in Lemma A.8), there exist standard probability
measures $\mu_0, \ldots, \mu_m$, (standard) vectors $b_{m+1}, \ldots, b_{k'} \in \mathbb{R}^k$, and $\epsilon_0, \ldots, \epsilon_{k'}$ such that
$\nu = \sum_{j=0}^{m} \epsilon_j \mu_j + \sum_{j=m+1}^{k'} \epsilon_j b_j$.

The base case is immediate from Lemma A.8: taking $b_j$, $j = 1, \ldots, k'$, as in Lemma A.8,
$b_0$ is in fact a probability measure since $b_0 = \mathrm{st}(\nu)$. Suppose that the result holds for
$m$. Consider $b_{m+1}$. If $b_{(m+1)i} < 0$ for some $i$ then, since $\nu(U_i) \ge 0$, there must exist
$j' \in \{0, \ldots, m\}$ such that $\mu_{j'}(U_i) > 0$. Thus, there exists some $N > 0$ such that
$N \mu_{j'}(U_i) + b_{(m+1)i} > 0$. Since there are only finitely many basic elements and every
element in the vector $\mu_j$ is nonnegative, for $j = 0, \ldots, m$, there must exist some $N'$ such
that $b'_{m+1} = N'(\mu_0 + \cdots + \mu_m) + b_{m+1} \ge 0$. Let $c = \sum_{i=1}^k b'_{(m+1)i}$, and let $\mu_{m+1} = b'_{m+1}/c$.
Clearly,

$$\nu = (\epsilon_0 - N' \epsilon_{m+1}) \mu_0 + \cdots + (\epsilon_m - N' \epsilon_{m+1}) \mu_m + c\,\epsilon_{m+1} \mu_{m+1} + \sum_{j=m+2}^{k'} \epsilon_j b_j.$$

This completes the proof of the inductive step.
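
The inductive step can be sketched in Python with exact rational arithmetic. This is an illustration under stated assumptions, not the paper's own procedure: the function name `make_measure` and the example vectors are hypothetical, and taking the least integer $N'$ that works is just one choice, since the proof only needs some $N'$ to exist.

```python
from fractions import Fraction as Fr
from math import ceil

def make_measure(mus, b):
    """Turn the standard vector b (playing the role of b_{m+1}) into a
    probability measure, given the measures mu_0, ..., mu_m found so far.

    Returns (N, c, mu_next) with N * (mu_0 + ... + mu_m) + b == c * mu_next.
    """
    total = [sum(col) for col in zip(*mus)]  # mu_0 + ... + mu_m, componentwise
    N = 0
    for t, bi in zip(total, b):
        if bi < 0:
            if t == 0:
                # Cannot happen in the proof, because nu(U_i) >= 0.
                raise ValueError("negative entry where every mu_j vanishes")
            N = max(N, ceil(Fr(-bi) / Fr(t)))
    shifted = [N * t + bi for t, bi in zip(total, b)]  # b'_{m+1} >= 0
    if all(s == 0 for s in shifted):
        # b was an exact negative multiple of the total; a larger N also works.
        N += 1
        shifted = [N * t + bi for t, bi in zip(total, b)]
    c = sum(shifted)
    return N, c, [s / c for s in shifted]

# Hypothetical example: nu = (1/2, 1/2) + eps * (1, -1).
mu0 = [Fr(1, 2), Fr(1, 2)]
b1 = [Fr(1), Fr(-1)]
N, c, mu1 = make_measure([mu0], b1)
print(N, c, mu1)
```

With these values the coefficient of $\epsilon$ in $(1 - N'\epsilon)\mu_0 + c\,\epsilon\,\mu_1$ is $-N'\mu_0 + c\,\mu_1$, which recovers the original vector `b1`.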

The theorem now immediately follows. ∎

Proposition 4.6: Every LPS over $(W, \mathcal{F})$ is equivalent to an LPS over $(W, \mathcal{F})$ of length
at most $|\mathcal{F}|$.

Proof: The argument is essentially the same as that for Proposition 4.4, using the
observation that a probability measure over $(W, \mathcal{F})$ can be identified with an element of
$\mathbb{R}^{|\mathcal{F}|}$: the vector defining the probabilities of the elements in $\mathcal{F}$. I leave details to the
reader. ∎


Proposition A.9: For the NPS $(W, \mathcal{F}, \nu)$ constructed in Example 4.10, there is no LPS
$\vec{\mu}$ over $(W, \mathcal{F})$ such that $\nu \approx \vec{\mu}$.

Proof:  I  start  with  a  straightforward  lemma. 

Lemma A.10: Given an LPS $\vec{\mu}$, there is an LPS $\vec{\mu}'$ such that $\vec{\mu} \approx \vec{\mu}'$ and all the
probability measures in $\vec{\mu}'$ are distinct.

Proof: Define $\vec{\mu}'$ to be the subsequence consisting of all the distinct probability measures
in $\vec{\mu}$. That is, suppose that $\vec{\mu} = (\mu_0, \mu_1, \ldots)$. Then $\vec{\mu}' = (\mu_{k_0}, \mu_{k_1}, \ldots)$, where $k_0 = 0$,
and, if $k_\alpha$ has been defined for all $\alpha < \beta$ and there exists an index $\gamma$ such that $\mu_{k_\alpha} \ne \mu_\gamma$
for all $\alpha < \beta$, then $k_\beta$ is the least index $\delta$ such that $\mu_{k_\alpha} \ne \mu_\delta$ for all $\alpha < \beta$. If there is no
index $\gamma$ such that $\mu_\gamma \notin \{\mu_{k_\alpha} : \alpha < \beta\}$, then $\vec{\mu}' = (\mu_{k_\alpha} : \alpha < \beta)$. I leave it to the reader
to check that $\vec{\mu} \approx \vec{\mu}'$. ∎

Returning to the proof of Proposition A.9, suppose by way of contradiction that $\nu \approx \vec{\mu}$.
Without loss of generality, by Lemma A.10, assume that all the probability measures in
$\vec{\mu}$ are distinct. Clearly $E_\nu(\chi_W) < E_\nu(\alpha \chi_{\{w_1\}})$ if $\alpha \ge 2$ and $E_\nu(\chi_W) > E_\nu(\alpha \chi_{\{w_1\}})$
if $\alpha < 2$. Since $\nu \approx \vec{\mu}$, it must be the case that $E_{\vec{\mu}}(\chi_W) < E_{\vec{\mu}}(\alpha \chi_{\{w_1\}})$ if $\alpha \ge 2$
and $E_{\vec{\mu}}(\chi_W) > E_{\vec{\mu}}(\alpha \chi_{\{w_1\}})$ if $\alpha < 2$. Since $E_{\vec{\mu}}(\chi_W) = (1, 1, \ldots)$, it follows that if
$\vec{\mu} = (\mu_0, \mu_1, \ldots)$, it must be the case that $\mu_0(w_1) = 1/2$ and

$$\mu_1(w_1) \ge 1/2. \tag{3}$$

Similar arguments (comparing $\chi_W$ to $\chi_{\{w_j\}}$) can be used to show that $\mu_0(w_j) = 1/2^j$ and
$\mu_1(w_{2j-1}) \ge 1/2^{2j-1}$ for $j = 1, 2, \ldots$. Next, observe that $E_\nu(\chi_{\{w_1\}} - 2^{2k-1} \chi_{\{w_{2k}\}}) = (2^k + 1)\epsilon$.
Thus,

$$E_\nu(\chi_{\{w_1\}} - 2^{2k-1} \chi_{\{w_{2k}\}}) = E_\nu((2^k + 1)(\chi_{\{w_1\}} - (\chi_W/2))).$$

It follows that the same relationship must hold if $\nu$ is replaced by $\vec{\mu}$. That is,

$$\mu_1(w_1) - 2^{2k-1} \mu_1(w_{2k}) = (2^k + 1)(\mu_1(w_1) - (1/2)).$$

Rearranging terms, this gives

$$2^k \mu_1(w_1) + 2^{2k-1} \mu_1(w_{2k}) = 2^{k-1} + 1/2,$$

or

$$\mu_1(w_1) + 2^{k-1} \mu_1(w_{2k}) = 1/2 + 1/2^{k+1}. \tag{4}$$

Thus, $\mu_1(w_1) \le 1/2 + 1/2^{k+1}$ for all $k \ge 1$. Putting this together with (3), it follows that
$\mu_1(w_1) = 1/2$. Plugging this into (4) gives $\mu_1(w_{2k}) = 1/2^{2k}$. It now follows that $\mu_1 = \mu_0$,
contradicting the choice of $\vec{\mu}$. ∎
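
For readers who want the intermediate algebra behind (4), the rearrangement can be spelled out as follows. This is only a worked expansion of the identity for $\mu_1$ stated in the proof; nothing beyond elementary algebra is assumed.

```latex
\begin{align*}
\mu_1(w_1) - 2^{2k-1}\mu_1(w_{2k})
  &= (2^k + 1)\bigl(\mu_1(w_1) - \tfrac12\bigr) \\
  &= 2^k \mu_1(w_1) + \mu_1(w_1) - 2^{k-1} - \tfrac12, \\
\text{so}\quad 2^k \mu_1(w_1) + 2^{2k-1}\mu_1(w_{2k})
  &= 2^{k-1} + \tfrac12, \\
\text{and dividing by } 2^k:\quad
\mu_1(w_1) + 2^{k-1}\mu_1(w_{2k})
  &= \tfrac12 + \tfrac{1}{2^{k+1}}. \\
\text{With } \mu_1(w_1) = \tfrac12:\quad
2^{k-1}\mu_1(w_{2k}) &= \tfrac{1}{2^{k+1}}
  \quad\Longrightarrow\quad \mu_1(w_{2k}) = \tfrac{1}{2^{2k}}.
\end{align*}
```

The bound $\mu_1(w_1) \le 1/2 + 1/2^{k+1}$ follows from the fourth line because $2^{k-1}\mu_1(w_{2k}) \ge 0$.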

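Theorem 5.1: $F_{N \to P}$ is a bijection from $\mathit{NPS}(W, \mathcal{F})/{\simeq}$ to $\mathit{Pop}(W, \mathcal{F})$ and from
$\mathit{NPS}^c(W, \mathcal{F})/{\simeq}$ to $\mathit{Pop}^c(W, \mathcal{F})$.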

Proof: As I said in the main text, the proof that $F_{N \to P}$ is an injection is straightforward,
and to prove that it is a surjection in the countably additive case, it suffices to show that
$F_{N \to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}', \mu)$, where $\nu \approx \vec{\mu}'$ and $\vec{\mu}'$ is the countably additive SLPS such
that $F_{S \to P}((W, \mathcal{F}, \vec{\mu}')) = (W, \mathcal{F}, \mathcal{F}', \mu)$. I now do this.

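Suppose that $F_{N \to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}_1, \mu_1)$. First I show that $\nu(U) = 0$ iff
$\vec{\mu}'(U) = \vec{0}$. Let $X = \chi_U$ and $Y = \chi_\emptyset$. Note that $\nu(U) = 0$ iff $E_\nu(X) = E_\nu(Y)$ iff
$E_{\vec{\mu}'}(X) = E_{\vec{\mu}'}(Y)$ iff $\vec{\mu}'(U) = \vec{0}$. Thus, $\mathcal{F}_1 = \{U : \nu(U) \ne 0\} = \{U : \vec{\mu}'(U) \ne \vec{0}\} = \mathcal{F}'$.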

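Now suppose by way of contradiction that $\mu \ne \mu_1$. Thus, there must exist some
$V \in \mathcal{F}$, $U \in \mathcal{F}'$ such that $\mu(V \mid U) \ne \mu_1(V \mid U)$. Let $\beta$ be the smallest ordinal such that
$\mu'_\beta(U) \ne 0$. It follows that $\mu'_\beta(V \mid U) \ne \mathrm{st}(\nu(V \mid U))$. We can assume without loss of
generality that $\mu'_\beta(V \mid U) > \mathrm{st}(\nu(V \mid U))$. Choose a real number $r$ such that
$\mu'_\beta(V \mid U) > r > \mathrm{st}(\nu(V \mid U))$. Then $E_{\vec{\mu}'}(\chi_{V \cap U}) > E_{\vec{\mu}'}(r \chi_U)$ but $E_\nu(\chi_{V \cap U}) < E_\nu(r \chi_U)$. This contradicts
the assumption that $\vec{\mu}' \approx \nu$. It follows that $F_{N \to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}', \mu)$, as desired.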

It remains to show that if $(W, \mathcal{F}, \mathcal{F}', \mu) \in \mathit{Pop}(W, \mathcal{F}) - \mathit{Pop}^c(W, \mathcal{F})$, then there is
some $(W, \mathcal{F}, \nu) \in \mathit{NPS}(W, \mathcal{F})$ such that $F_{N \to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}', \mu)$. My proof in this
case follows closely the lines of an analogous result proved by McGee [1994]. I provide
the details here mainly for completeness.

The proof relies on the following ultrafilter construction of non-Archimedean fields.
Given a set $S$, a filter $\mathcal{G}$ on $S$ is a nonempty set of subsets of $S$ that is closed under
supersets (so that if $U \in \mathcal{G}$ and $U \subseteq U'$, then $U' \in \mathcal{G}$), is closed under finite intersections
(so that if $U_1, U_2 \in \mathcal{G}$, then $U_1 \cap U_2 \in \mathcal{G}$), and does not contain $\emptyset$. An ultrafilter is a
maximal filter, that is, a filter that is not a strict subset of any other filter. It is not hard
to show that if $\mathcal{U}$ is an ultrafilter on $S$, then for all $U \subseteq S$, either $U \in \mathcal{U}$ or $\overline{U} \in \mathcal{U}$ [Bell
and Slomson 1974].
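
The filter axioms can be checked mechanically on a small finite set. The sketch below is only illustrative: the helper names are hypothetical, and on a finite set every ultrafilter is principal (it consists of all sets containing one fixed point), so the example does not capture the nonprincipal ultrafilters that the construction below actually needs. The test for an ultrafilter uses the standard characterization quoted above: a filter is maximal exactly when it contains $U$ or its complement for every $U \subseteq S$.

```python
from itertools import combinations

def powerset(S):
    """All subsets of S, as frozensets."""
    S = list(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def is_filter(G, S):
    """Check the three filter axioms for a family G of subsets of a finite set S."""
    G = set(G)
    if not G or frozenset() in G:          # nonempty, and does not contain the empty set
        return False
    for U in G:
        for V in powerset(S):
            if U <= V and V not in G:      # closed under supersets
                return False
        for V in G:
            if (U & V) not in G:           # closed under (finite) intersections
                return False
    return True

def is_ultrafilter(G, S):
    """A filter that contains exactly one of U and its complement, for every U."""
    S = frozenset(S)
    G = set(G)
    return is_filter(G, S) and all((U in G) != ((S - U) in G) for U in powerset(S))

S = {1, 2, 3}
principal = {U for U in powerset(S) if 1 in U}             # all sets containing 1
coarse = {frozenset({1, 2}), frozenset({1, 2, 3})}          # generated by {1, 2}
print(is_ultrafilter(principal, S), is_filter(coarse, S), is_ultrafilter(coarse, S))
```

The family `coarse` is a filter but not an ultrafilter, since it contains neither `{1}` nor its complement `{2, 3}`; it can be extended to the ultrafilter `principal`.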

Suppose $F$ is either $\mathbb{R}$ or a non-Archimedean field, $J$ is an arbitrary set, and $\mathcal{U}$
is an ultrafilter on $J$. Define an equivalence relation $\sim_{\mathcal{U}}$ on $F^J$ by taking
$(a_j : j \in J) \sim_{\mathcal{U}} (b_j : j \in J)$ if $\{j : a_j = b_j\} \in \mathcal{U}$. Similarly, define a total order $\preceq_{\mathcal{U}}$ by taking
$(a_j : j \in J) \preceq_{\mathcal{U}} (b_j : j \in J)$ if $\{j : a_j \le b_j\} \in \mathcal{U}$. (The fact that $\preceq_{\mathcal{U}}$ is total uses the fact
that for all $U \subseteq J$, either $U \in \mathcal{U}$ or $\overline{U} \in \mathcal{U}$. Note that the pointwise ordering on $F^J$ is
not total.) Let $F^J/{\sim_{\mathcal{U}}}$ consist of these equivalence classes. Note that $F$ can be viewed
as a subset of $F^J/{\sim_{\mathcal{U}}}$ by identifying $a \in F$ with the sequence of all $a$'s.

Define addition and multiplication on $F^J$ pointwise, so that, for example,
$(a_j : j \in J) + (b_j : j \in J) = (a_j + b_j : j \in J)$. It is easy to check that if $(a_j : j \in J) \sim_{\mathcal{U}} (a'_j : j \in J)$,
then $(a_j : j \in J) + (b_j : j \in J) \sim_{\mathcal{U}} (a'_j : j \in J) + (b_j : j \in J)$, and similarly for
multiplication. Thus, the definitions of $+$ and $\times$ can be extended in the obvious way to
$F^J/{\sim_{\mathcal{U}}}$. With these definitions, it is easy to check that $F^J/{\sim_{\mathcal{U}}}$ is a field that contains $F$.

Now given a Popper space $(W, \mathcal{F}, \mathcal{F}', \mu)$ and a finite subset $A = \{U_1, \ldots, U_k\} \subseteq \mathcal{F}$,
let $\mathcal{F}_A$ be the (finite) algebra generated by $A$ (that is, the smallest set containing
$\{U_1, \ldots, U_k, W\}$ that is closed under unions and complement). Let $\mathcal{F}'_A = \mathcal{F}_A \cap \mathcal{F}'$. It
follows from Theorem 3.1 that there is a finite SLPS $\vec{\mu}_A$ over $(W, \mathcal{F}_A)$ that is mapped to
$(W, \mathcal{F}_A, \mathcal{F}'_A, \mu_A)$ by $F_{S \to P}$. (Although Theorem 3.1 is stated for finite state spaces $W$, the
proof relies on only the fact that the algebra is finite, so it applies without change here.)
It now follows from Theorem 4.5 that, for each $A$, there is a nonstandard probability
space $(W, \mathcal{F}_A, \nu_A)$ with range $\mathbb{R}(\epsilon)$ that is equivalent to $\vec{\mu}_A$. By Proposition 4.2, it follows
that, for $U \in \mathcal{F}_A$, $U \in \mathcal{F}'_A$ iff $\nu_A(U) \ne 0$. Moreover, $\mathrm{st}(\nu_A(V \mid U)) = \mu_A(V \mid U)$ for $U \in \mathcal{F}'_A$ and
$V \in \mathcal{F}_A$.

Let $J$ consist of all finite subsets of $\mathcal{F}$. For a finite subset $A$ of $\mathcal{F}$ (that is, $A \in J$), let $G_A$ be the
subset of $J$ consisting of all sets in $J$ containing $A$. Let $\mathcal{G} = \{G \subseteq J : G \supseteq G_A$ for some $A \in J\}$.
It is easy to check that $\mathcal{G}$ is a filter on $J$. It is a standard result that every filter can be
extended to an ultrafilter [Bell and Slomson 1974]. Let $\mathcal{U}$ be an ultrafilter containing $\mathcal{G}$.
By the construction above, $\mathbb{R}(\epsilon)^J/{\sim_{\mathcal{U}}}$ is a non-Archimedean field.

Define $\nu$ on $(W, \mathcal{F})$ by taking $\nu(U) = (\nu_A(U) : A \in J)$, where $\nu_A(U)$ is taken to be $0$
if $U \notin \mathcal{F}_A$. To see that $\nu$ is indeed a nonstandard probability measure with the required
properties, note that clearly $\nu(W) = 1$ (where $1$ is identified with the sequence of all
$1$'s). Moreover, to see that $\nu(U) + \nu(V) = \nu(U \cup V)$ for disjoint $U$ and $V$, let $A_{U,V}$ be the smallest subalgebra
containing $U$ and $V$. Note that if $A \supseteq A_{U,V}$, then $\nu_A(U) + \nu_A(V) = \nu_A(U \cup V)$. Since
the set of algebras containing $A_{U,V}$ is an element of the ultrafilter, the result follows.
Similar arguments show that $\nu(U) \ne 0$ iff $U \in \mathcal{F}'$ and that $\mathrm{st}(\nu(V \mid U)) = \mu(V \mid U)$ if
$U \in \mathcal{F}'$ and $V \in \mathcal{F}$. Clearly $F_{N \to P}(\nu) = \mu$. ∎

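Proposition 5.2: If $\nu_1 \approx \nu_2$ then $\nu_1 \simeq \nu_2$.

Proof: Suppose that $\nu_1 \approx \nu_2$. To show that $\nu_1 \simeq \nu_2$, first suppose that $\nu_1(U) \ne 0$
for some $U \subseteq W$. Then $E_{\nu_1}(\chi_\emptyset) < E_{\nu_1}(\chi_U)$. Since $\nu_1 \approx \nu_2$, it must be the case that
$E_{\nu_2}(\chi_\emptyset) < E_{\nu_2}(\chi_U)$. Thus, $\nu_2(U) \ne 0$. A symmetric argument shows that if $\nu_2(U) \ne 0$
then $\nu_1(U) \ne 0$. Next, suppose that $\nu_1(U) \ne 0$ and $\nu_1(V \mid U) = \alpha$. Thus, $E_{\nu_1}(\alpha \chi_U) =
E_{\nu_1}(\chi_{U \cap V})$. Since $\nu_1 \approx \nu_2$, it follows that $E_{\nu_2}(\alpha \chi_U) = E_{\nu_2}(\chi_{U \cap V})$, and so $\nu_2(V \mid U) = \alpha$.
Thus, $\mathrm{st}(\nu_1(V \mid U)) = \mathrm{st}(\nu_2(V \mid U))$. Hence, $\nu_1 \simeq \nu_2$, as desired. ∎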

Theorem 6.5: There exists an NPS $\nu$ whose range is an elementary extension of the
reals such that $\vec{\mu} \approx \nu$ and $X_1, \ldots, X_n$ are independent with respect to $\nu$ iff there exists a
sequence $\vec{r}^j$, $j = 1, 2, \ldots$, of vectors in $(0, 1)^k$ such that $\vec{r}^j \to (0, \ldots, 0)$ as $j \to \infty$, and
$X_1, \ldots, X_n$ are independent with respect to $\vec{\mu} \,\Box\, \vec{r}^j$ for $j = 1, 2, 3, \ldots$.

Proof: Suppose that there exists an NPS $\nu$ whose range is an elementary extension of the
reals, $\vec{\mu} \approx \nu$, and $X_1, \ldots, X_n$ are independent with respect to $\nu$. Using arguments similar
in spirit to those of BBD [1991b, Proposition 2], it follows that there exist
positive infinitesimals $\epsilon_1, \ldots, \epsilon_k$ such that $\vec{\mu} \,\Box\, (\epsilon_1, \ldots, \epsilon_k) = \nu$. It is not hard to show that
there exists a finite set of real-valued polynomials $p_1, \ldots, p_N$ such that $p_i(\epsilon_1, \ldots, \epsilon_k) = 0$
for $i = 1, \ldots, N$, and if $\vec{r}$ is a vector of positive reals such that $p_i(\vec{r}) = 0$ for $i = 1, \ldots, N$,
then $X_1, \ldots, X_n$ are independent with respect to $\vec{\mu} \,\Box\, \vec{r}$. Thus, for all natural numbers
$m \ge 1$, the range of $\nu$ satisfies the first-order property

$$\exists x_1 \ldots \exists x_k (p_1(x_1, \ldots, x_k) = 0 \land \ldots \land p_N(x_1, \ldots, x_k) = 0 \land 0 < x_1 < 1/m \land \ldots \land 0 < x_k < 1/m).$$

Since the range of $\nu$ is an elementary extension of the reals, this first-order property
holds of the reals as well. Thus, there exists a sequence $\vec{r}^j$ of vectors of positive reals
converging to $\vec{0}$ such that $p_i(\vec{r}^j) = 0$ for $i = 1, \ldots, N$ and all $j$, so that $X_1, \ldots, X_n$ are
independent with respect to each $\vec{\mu} \,\Box\, \vec{r}^j$.

The converse follows by a straightforward application of compactness in first-order
logic [Enderton 1972]. Suppose that there exists a sequence $\vec{r}^j$, $j = 1, 2, \ldots$, of vectors in
$(0, 1)^k$ such that $\vec{r}^j \to (0, \ldots, 0)$ as $j \to \infty$, and $X_1, \ldots, X_n$ are independent with respect
to $\vec{\mu} \,\Box\, \vec{r}^j$ for $j = 1, 2, 3, \ldots$. We now apply the compactness theorem. As I mentioned
in the proof of Proposition 4.6, the compactness theorem says that, given a collection
of formulas, if each finite subset has a model, then so does the whole set. Consider a
language with the function symbols $+$ and $\times$, the binary relation $<$, a constant symbol
$\mathbf{r}$ for each real number $r$, a unary predicate $N$ (representing the natural numbers), and
constant symbols $p_U$ for each set $U \in \mathcal{F}$. Intuitively, $p_U$ represents $\nu(U)$. Consider the
following (uncountable) collection of formulas:
(a) All first-order formulas in this language true of the reals. (This includes, for
example, a formula such as $\forall x \forall y (x + y = y + x)$, which says that addition is commutative,
as well as formulas such as $2 + 3 = 5$ and $\sqrt{2} \times \sqrt{3} = \sqrt{6}$.)

(b) Formulas $p_U > 0$ for $U \in \mathcal{F}'$ and $p_U = 0$ for $U \in \mathcal{F} - \mathcal{F}'$.

(c) Formulas $p_U + p_V = p_{U \cup V}$ if $U \cap V = \emptyset$.

(d) The formula $p_W = 1$.

(e) Formulas of the form $p_{X_1 = x_1} \times \cdots \times p_{X_n = x_n} = p_{X_1 = x_1 \cap \ldots \cap X_n = x_n}$, for all values
$x_i \in \mathcal{V}(X_i)$, $i = 1, \ldots, n$; these formulas say that $X_1, \ldots, X_n$ are independent with
respect to $\nu$.

(f) For every pair $Y$, $Y'$ of random variables such that $E_{\vec{\mu}}(Y) > E_{\vec{\mu}}(Y')$, a formula
that says $E_\nu(Y) > E_\nu(Y')$, where $E_\nu(Y)$ and $E_\nu(Y')$ are expressed using the
constant symbols $p_U$ (where the events $U$ are those of the form $Y = y$ and $Y' = y'$).
Note that this formula is finite, since $Y$ and $Y'$ are assumed to have finite range.
The formula would not be expressible in first-order logic if $Y$ or $Y'$ had infinite
range.

It is not hard to show that every finite subset of these formulas is satisfiable.
Indeed, given a finite subset of formulas, there must exist some $m$ such that taking
$p_U = (\vec{\mu} \,\Box\, \vec{r}^m)(U)$ will work (and interpreting $\mathbf{r}$ as the real number $r$, of course). The
only nonobvious part is showing that we can deal with the formulas in part (f); that
we can do so follows from the proof of Proposition 1 in [Blume, Brandenburger, and
Dekel 1991b], which shows that $E_{\vec{\mu}}(Y') > E_{\vec{\mu}}(Y)$ iff there exists some $M$ such that
$E_{\vec{\mu} \Box \vec{r}^m}(Y') > E_{\vec{\mu} \Box \vec{r}^m}(Y)$ for all $m \ge M$.

Since every finite set of formulas is satisfiable, by compactness, the infinite set is
satisfiable. Let $\nu(U)$ be the interpretation of $p_U$ in a model satisfying these formulas.
Then it is easy to check that the range of $\nu$ is an elementary extension of the reals, $\nu \approx \vec{\mu}$, and that
$X_1, \ldots, X_n$ are independent with respect to $\nu$. ∎

Theorem 6.6: $X_1, \ldots, X_n$ are strongly independent with respect to the Popper space
$(W, \mathcal{F}, \mathcal{F}', \mu)$ iff there exists an NPS $(W, \mathcal{F}, \nu)$ such that $F_{N \to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}', \mu)$
and $X_1, \ldots, X_n$ are independent with respect to $(W, \mathcal{F}, \nu)$.

Proof: It easily follows from Kohlberg and Reny's [1997, Theorem 2.10] characterization
of strong independence that if $X_1, \ldots, X_n$ are independent with respect to the NPS
$(W, \mathcal{F}, \nu)$ then $X_1, \ldots, X_n$ are strongly independent with respect to $F_{N \to P}(W, \mathcal{F}, \nu)$.

The converse follows using compactness, much as in the proof of Theorem 6.5.
Suppose that $(W, \mathcal{F}, \mathcal{F}', \mu)$ is a Popper space and $\mu_j \to \mu$ are as required for $X_1, \ldots, X_n$ to
be strongly independent with respect to $\mu$. Consider the same language as in the proof
of Theorem 6.5, and essentially the same collection of formulas, except that the formulas
of part (f) are replaced by

(f) Formulas of the form $(r - \frac{1}{n}) p_V \le p_{U \cap V} \le (r + \frac{1}{n}) p_V$ for all $U$, $V$, $r$, and $n > 0$
such that $\mu(U \mid V) = r$.

Again, it is easy to see that every finite subset of these formulas is satisfiable. Indeed,
given a finite subset of formulas, there must exist some $m$ such that taking $p_U = \mu_m(U)$
satisfies all the formulas (and interpreting $\mathbf{r}$ as the real number $r$, of course). By
compactness, the infinite set is satisfiable. Let $\nu(U)$ be the interpretation of $p_U$ in a model
satisfying these formulas. Then it is easy to check that $F_{N \to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}', \mu)$,
and that $X_1, \ldots, X_n$ are independent with respect to $\nu$. ∎